Boost logo

Boost :

Subject: Re: [boost] GSOC 2015 : Project on Concurrent Hash Tables
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-02-25 09:27:08


On 25 Feb 2015 at 4:57, Rob Stewart wrote:

> Why allocate? Just ensure old_map is destructible or assignable. Also
> note that new can throw which would violate your noexcept declaration.

Firstly Rob thanks for commenting in such detail.

Anyone interested may be curious as to the concurrent_unordered_map
algorithm, so I'll describe that here as it should completely
transform what would be a correct vs an incorrect move and copy
implementation.

concurrent_unordered_map keeps an array of bucket_type in an atomic
pointer _buckets. Each bucket_type has its own tristate spin lock
where 0 is unlocked, 1 is locked, 2 is "this bucket is invalid and
has been replaced, please go reload the _buckets pointer".

Inserting an item looks up the bucket and locks its spin lock. It
does a linear scan of the array of items in the bucket, looking for
empty slots. If it finds an empty slot, it fills it. If it does not
find an empty slot, it extends the array in the buckets. Erasure is
very similar. Note that inserting and erasing does NOT hold the
rehash lock.

Rehashing is pretty much the only operation which can't run
concurrently to any other, so it gets a rehash lock to enforce rehash
serialisation as doing two rehashes at once could not work well.
Rehashing involves creating a new bucket_type array and marking each
of its buckets as locked. We replace the base _buckets pointer with
the new array. We now mark all the buckets in the old bucket_type
array as having state 2 (reload buckets list). We now copy the hash
and pointer to every item in the old buckets array into the new
buckets array. If an exception is thrown at any point, we can always
restore the previous buckets list, which is untouched. If we succeed
in completely transferring all the items, we remove all items from
the old buckets array to reduce its memory consumption to minimum,
and append the old buckets array to a ring buffer so that it hangs
around for a while in memory. This is because other threads may still
be inbetween the load of the buckets array and the first inspection
of the bucket lock, and therefore deleting the buckets array
immediately would delete the storage out from underneath them.

As Rob mentioned, swap(), operator=(&&) and the move constructor are
all closely related, and in many C++ classes one implements
operator=(&&) entirely in terms of swap() and the move constructor.
Some C++ classes can additionally implement swap() entirely in terms
of move constructors, so in fact one can reduce everything to just a
move constructor implementation.

However, under concurrency this may or may not be ideal. I know how
I'd do it, but the test here for the student is to demonstrate just
how well they understand concurrent C++ programming.

I would say Amarnath that I would be surprised if your posted
solution would pass when tested with valgrind, and I would doubt it
would pass with the thread sanitiser. I'd urge you to think about
test code which when combined with valgrind and the thread sanitiser
really demonstrates the correctness of your implementation under any
CPU load situation. Think in terms of threads taking _seconds_ to
realise that a bucket list is out of date, and you're going the right
way.

And keep going, you're doing well!

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/



Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk