Subject: Re: [boost] [lockfree::fifo] Review
From: Helge Bahmann (hcb_at_[hidden])
Date: 2009-12-10 11:36:22


On Wed, 9 Dec 2009, Anthony Williams wrote:

>> So far I have not yet met any 64-bit platform that provides 64 bits of
>> (user-space!) address space (the same cannot be said of 32-bit
>> platforms), so I would say that any use of spare bits on 64 bit is
>> safe (for user-space!).
>
> As Chris also noted, I would be wary of using them for ABA checking
> though: it's easy to overflow a 16-bit counter quite quickly, and that's
> all you've got even if the OS only uses 48 of your 64 bits for address
> space.

This concern is certainly valid for basically any algorithm that uses
"generation counters" or similar constructs for disambiguation (it could
be argued that even 32 bits may not be enough). Using spare bits at all
should however be considered a valid technique.
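
For concreteness, here is a minimal sketch of the spare-bits technique,
assuming the common case of a 48-bit user-space address space: the top 16
bits of a Treiber-style stack's head hold a generation counter that is
bumped on every successful update, so a CAS fails if the head was changed
and changed back in the meantime. The names (TaggedPtr, pack, ptr_of,
tag_of) are illustrative and std::atomic is used to keep the sketch
self-contained; as noted above, a 16-bit counter can wrap quickly under
heavy traffic.

#include <atomic>
#include <cassert>
#include <cstdint>

struct Node { Node* next; int value; };

using TaggedPtr = std::uint64_t;

constexpr unsigned  PTR_BITS = 48;                // assumed address width
constexpr TaggedPtr PTR_MASK = (TaggedPtr(1) << PTR_BITS) - 1;

inline TaggedPtr pack(Node* p, std::uint16_t tag)
{
    std::uint64_t raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & ~PTR_MASK) == 0 && "pointer really fits in 48 bits");
    return raw | (TaggedPtr(tag) << PTR_BITS);
}
inline Node*         ptr_of(TaggedPtr t) { return reinterpret_cast<Node*>(t & PTR_MASK); }
inline std::uint16_t tag_of(TaggedPtr t) { return std::uint16_t(t >> PTR_BITS); }

// Treiber-style stack; the tag changes on every successful update, so a
// CAS that observes a recycled node address still fails (ABA detection).
void push(std::atomic<TaggedPtr>& head, Node* n)
{
    TaggedPtr old_head = head.load(std::memory_order_relaxed);
    do {
        n->next = ptr_of(old_head);
    } while (!head.compare_exchange_weak(old_head,
                                         pack(n, tag_of(old_head) + 1),
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
}

Node* pop(std::atomic<TaggedPtr>& head)
{
    TaggedPtr old_head = head.load(std::memory_order_acquire);
    Node* n;
    do {
        n = ptr_of(old_head);
        if (!n)
            return nullptr;
        // Dereferencing n->next below assumes nodes come from a pool and
        // are never returned to the allocator while the stack is in use.
    } while (!head.compare_exchange_weak(old_head,
                                         pack(n->next, tag_of(old_head) + 1),
                                         std::memory_order_acquire,
                                         std::memory_order_acquire));
    return n;
}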

>> FWIW there are other ways to avoid ABA than tagging pointers, and
>> without DCAS on 32-bit, see e.g. this sample implementation of a
>> lock-free stack:
>>
>> http://www.chaoticmind.net/~hcb/projects/boost.atomic/stack.hpp
>
> Yup. Incidentally, you don't even need to count the nodes in push,
> provided you're careful when reclaiming nodes in pop.

Hm, I'm not sure what you mean; I don't think I am counting anything but
the number of threads in the critical section?
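
To make that concrete, here is a minimal sketch of the general idea (it
is deliberately not a copy of the stack.hpp linked above): pop() keeps a
count of the threads currently inside it, and a popped node is only
deleted once that count shows no other thread can still hold a pointer to
it; otherwise the node is parked on a to-be-deleted list. All names are
illustrative and plain (seq_cst) ordering is used for brevity.

#include <atomic>
#include <utility>

template <class T>
class counted_pop_stack {
    struct node {
        T     value;
        node* next;
        explicit node(T v) : value(std::move(v)), next(nullptr) {}
    };

    std::atomic<node*>    head{nullptr};
    std::atomic<unsigned> threads_in_pop{0};      // threads currently in pop()
    std::atomic<node*>    to_be_deleted{nullptr}; // popped but not yet freed

    static void delete_chain(node* n) {
        while (n) { node* next = n->next; delete n; n = next; }
    }
    void defer_chain(node* first) {
        node* last = first;
        while (last->next) last = last->next;
        last->next = to_be_deleted.load();
        while (!to_be_deleted.compare_exchange_weak(last->next, first)) {}
    }

    void try_reclaim(node* popped) {
        if (threads_in_pop.load() == 1) {
            // We are currently the only thread inside pop(); claim the
            // pending list.
            node* pending = to_be_deleted.exchange(nullptr);
            if (threads_in_pop.fetch_sub(1) == 1) {
                // Nobody entered pop() meanwhile: the pending nodes are
                // unreachable and can be freed.
                delete_chain(pending);
            } else if (pending) {
                // Someone came in after all: put the list back for later.
                defer_chain(pending);
            }
            // Safe: any thread that had seen 'popped' would still have
            // been counted in threads_in_pop above.
            delete popped;
        } else {
            // Other threads may still hold a pointer to 'popped': defer it.
            popped->next = nullptr;
            defer_chain(popped);
            threads_in_pop.fetch_sub(1);
        }
    }

public:
    void push(T v) {
        node* n = new node(std::move(v));
        n->next = head.load();
        while (!head.compare_exchange_weak(n->next, n)) {}
    }

    bool pop(T& out) {
        threads_in_pop.fetch_add(1);
        node* old_head = head.load();
        while (old_head && !head.compare_exchange_weak(old_head, old_head->next)) {}
        if (!old_head) {
            threads_in_pop.fetch_sub(1);
            return false;
        }
        out = std::move(old_head->value);
        try_reclaim(old_head);                    // also decrements the count
        return true;
    }
};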

> The downside of this technique is that if the traffic is too heavy then
> the ref count never reaches zero, so the nodes can never be reclaimed.

Right, and although I would argue that such a highly contended hotspot is
a design flaw in itself, there are ways to do reliable reclamation even in
that case, e.g. schemes that allow reclaiming an element only after the
last thread that could actually have "seen" it has left the critical
section. This can be done with checkpoints similar to RCU, so it can even
be more efficient than the atomic counter I used.
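
To illustrate what I mean by checkpoints, here is a heavily simplified
sketch of epoch-based reclamation (again not taken from stack.hpp, and
glossing over some memory-ordering subtleties): readers announce which
epoch they observed on entry, the global epoch may only advance once every
active reader has observed the current one, and garbage that is two epochs
old is then known to be unreachable. MAX_THREADS, EpochDomain and the
mutex-serialized retire side are simplifying assumptions.

#include <atomic>
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

constexpr std::size_t MAX_THREADS = 16;   // assumed fixed upper bound

class EpochDomain {
    std::atomic<unsigned> global_epoch{0};

    struct alignas(64) Slot {
        std::atomic<bool>     active{false};
        std::atomic<unsigned> epoch{0};
    } slots[MAX_THREADS];

    std::mutex retire_lock;                          // writer side only
    std::vector<std::function<void()>> garbage[3];   // per-epoch deleters

public:
    // Read-side checkpoint: announce that we are inside a critical
    // section and which epoch we observed on entry.
    void enter(std::size_t tid) {
        slots[tid].active.store(true, std::memory_order_seq_cst);
        slots[tid].epoch.store(global_epoch.load(std::memory_order_seq_cst),
                               std::memory_order_seq_cst);
    }
    void leave(std::size_t tid) {
        slots[tid].active.store(false, std::memory_order_release);
    }

    // Retire an element that has already been unlinked from the data
    // structure; 'reclaim' is invoked once no reader can still see it.
    void retire(std::function<void()> reclaim) {
        std::lock_guard<std::mutex> guard(retire_lock);
        unsigned e = global_epoch.load(std::memory_order_seq_cst);
        garbage[e % 3].push_back(std::move(reclaim));
        // The epoch may only advance once every active reader has
        // observed the current one.
        for (auto& s : slots)
            if (s.active.load(std::memory_order_seq_cst) &&
                s.epoch.load(std::memory_order_seq_cst) != e)
                return;
        global_epoch.store(e + 1, std::memory_order_seq_cst);
        // Garbage retired two epochs ago can no longer be referenced:
        // reclaim it and reuse its bucket for the new epoch.
        for (auto& f : garbage[(e + 1) % 3]) f();
        garbage[(e + 1) % 3].clear();
    }
};
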
Yet I am not sure this is really worth it; I would like to make an
entirely different point that has been nagging me:

I'm not certain that portable, generally useful "symmetric" [1] wait-free
(or obstruction-free) data structures for unbounded numbers of threads are
both a) feasible and b) required. As to a), implementations overwhelmingly
end up being more expensive than locking (as well as less deterministic
than locking with priority inheritance), and there is also some
theoretical basis for suspecting this to be an intrinsic property [2] [3].
As to b), the cases when I have needed and successfully used
wait-/obstruction-free data structures are roughly:

- bounded numbers of threads with static roles (e.g. producer/consumer
with single reader/multiple writer or vice versa)

- strongly asymmetric access patterns (e.g. number of read accesses vastly
outweighs number of write accesses; or the converse: a producer in
"interrupt" context, with many consumers in the slow path)

In the former case, often all operations can/must be
wait-/obstruction-free. In the latter case the goal is just to make a
subset of the operations "less expensive" or wait-/obstruction-free, while
the other subset typically becomes "more expensive" (a sketch of such a
read-mostly case follows below). Both cases are easier
to address, cover a large number of use cases and still pay off
tremendously.
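
As a trivial illustration of the read-mostly flavour of the latter case,
consider a configuration record that is read constantly but updated
rarely: the read side is a single acquire load (wait-free), while the
write side copies and republishes the record and carries all of the cost.
The names are illustrative, a single writer (or external serialization of
writers) is assumed, and reclamation of the old version is deliberately
omitted (it would need RCU/epochs/hazard pointers, or an acceptable
bounded leak).

#include <atomic>
#include <string>

struct Config {
    int         max_connections;
    std::string log_path;
};

std::atomic<const Config*> current_config{new Config{64, "/tmp/log"}};

// Read side: wait-free, a single acquire load, no stores to shared state.
int reader_max_connections()
{
    return current_config.load(std::memory_order_acquire)->max_connections;
}

// Write side: expensive (allocation + copy), but assumed to be rare and
// performed by a single writer (or under an external lock).
void writer_set_max_connections(int new_max)
{
    const Config* old = current_config.load(std::memory_order_acquire);
    Config* copy = new Config(*old);          // copy-on-write
    copy->max_connections = new_max;
    current_config.store(copy, std::memory_order_release);
    // Freeing 'old' safely needs a reclamation scheme (RCU, epochs,
    // hazard pointers); deliberately omitted in this sketch.
}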

Personally I would therefore consider less generic data structures more
useful, e.g. hash-tables/trees with wait-free read-side path, or fast
fully wait-free 1:n and n:1 producer/consumer data structures -- instead
of not very portable (DCAS), quite expensive (compared to locking) n:m p/c
data structures that strive to be egg-laying woolly dairy pigs. (I don't
consider my stack implementation terribly useful, for that matter).
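
As an example of what such a less generic structure can look like, the
simplest member of the 1:n / n:1 family is a bounded single-producer,
single-consumer queue in which both ends are wait-free; the names, the
power-of-two capacity and a default-constructible T are illustrative
assumptions.

#include <atomic>
#include <cstddef>
#include <optional>

template <class T, std::size_t Capacity>
class spsc_queue {
    static_assert((Capacity & (Capacity - 1)) == 0, "power-of-two capacity");

    T buffer[Capacity];                 // T assumed default-constructible
    std::atomic<std::size_t> head{0};   // written only by the consumer
    std::atomic<std::size_t> tail{0};   // written only by the producer

public:
    // Producer side only: wait-free, one release store.
    bool push(T v) {
        std::size_t t = tail.load(std::memory_order_relaxed);
        if (t - head.load(std::memory_order_acquire) == Capacity)
            return false;                              // full
        buffer[t & (Capacity - 1)] = std::move(v);
        tail.store(t + 1, std::memory_order_release);
        return true;
    }
    // Consumer side only: wait-free, one release store.
    std::optional<T> pop() {
        std::size_t h = head.load(std::memory_order_relaxed);
        if (h == tail.load(std::memory_order_acquire))
            return std::nullopt;                       // empty
        T v = std::move(buffer[h & (Capacity - 1)]);
        head.store(h + 1, std::memory_order_release);
        return v;
    }
};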

Helge

[1] by "symmetric" I mean: all types of accesses (e.g. both insertion and
deletion) are wait-/obstruction-free

[2] http://portal.acm.org/citation.cfm?id=1146427

[3] http://portal.acm.org/citation.cfm?id=197975

