Subject: Re: [boost] [lockfree] Review
From: Tim Blechmann (tim_at_[hidden])
Date: 2011-07-30 13:59:16
thanks for the review!
> The library seems well-designed. As I believe has been discussed in a
> separate thread, naming is somewhat inconsistent. The failure modes are
> not clear to me (e.g. does enqueue/push return false or throw an exception
> when there is no more memory), but overall I think Tim did an excellent
have already improved the documentation of these functions, but i will check
their wording again.
> As mentioned, it would appear that bundled atomic implementation has a lot
> of shortcomings, but one is particularly bad: atomic<tagged_ptr> is not
> actually lock-free for Visual C++. The DCAS implementation should be
> straightforward, but seems to be absent . in case you wonder I did an
> initial implementation of <(cstd)atomic> and added all required intrinsics
> for a simple implementation to the compiler. These made it all into
> VC++2010 RTM (IIRC, there was only one 64-bit, most where 8 & 16-bit
i don't have access to any msvc compiler, do you think you could contribute the
msvc-specific code for boost.atomic?
besides, i have a version that only requires 32bit cas.
> There are a few points that make me a bit nervous. Specifically, the
> alignment story is a bit weak. It seems there is only one use of
> BOOST_LOCKFREE_CACHELINE_ALIGNMENT and that's for fifo<>::node. Older
> compilers (GCCs and VC at least) don't dynamically align the stack with
> __declspec(align)/__attribute__((aligned). Other classes do not use
> alignment at all.
i don't know of any standard-compliant c++ way to enforce a specific alignment.
it doesn't affect the correct, but is merely a hint for the compiler
> It seems somewhat inconsequent to add internal padding
> to move members to different cachelines, but at the same time not ensuring
> alignment (which means potential cacheline overlaps at back and front of
> the object). Also the code seems to assume that compilers will use an
> empty base class optimization for noncopyable -- I guess, that's fine for
> most major compilers.
is there any current compile, that doesn't optimize empty base classes? i read
in a document from the early 1990s, that most recent compilers do an empty base
> Also it is not clear how it is ensured that nodes allocated are actually
> aligned. Allocators can and usually do ignore alignment requirements
> greater than intmax_t types. Some may just fail to instantiate.
the cache line alignment is again an optimization, that should work reasonably
well. it shouldn't affect the correctness, though.
> likely/unlikely are often defined as macros as in the Linux kernel. People
> probably shouldn't do that for C++, but I have seen it quite a bit. It
> probably doesn't help that many compilers' inlining oracles don't do very
> well for modern code.
why? often the hints are generated by profile-guided optimization, but very few
build systems actually make use of them.
> The name freelist_t is somewhat confusing for a template parameter and
> while I'm not sure what harm it could do it should probably follow the
> general convention of CamelCase.
already replaced with boost.parameter-style policies
> The definition of wait-free and lock-free is a bit weird. Specifically, the
> term concurrent operations is a bit vague. I think it's easier to grasp if
> you say something along the lines of at least one thread makes forward
> progress at any given point in time. With the current definition in the
> docs (and most others I know of), it may not be immediately clear why a
> mutex-based implementation wouldn't be lock-free.
the wait-free and lock-free definitions are almost exactly taken from
herlihy&shavit. i can try to explain them a bit more..
> There's a paragraph:
> > The synchronization is done completely in user-space with no interaction
> > without any direct interaction with the operating system  . This
> > implies that they are not prone to issues like priority inversion (a
> > low-priority thread needs to wait for a high-priority thread).
> I think there's some duplication in the first sentence. I'm not sure what
> lock-free has to do with priority inversion and I would think that
> priority inversion means priorities are -- well, inverted. I.e. a
> high-priority thread is not scheduled, because a lower priority thread
> owns a resource needed by the high-priority thread.
well, but with lock-free programming, no thread owns a resource.
> This also makes it sound a bit like lock-free is superior to OS mechanisms,
> but that really depends. A standard lock-free list is not fair for
> writers, but an implementation making use of OS facilities can be. On the
> other hand ticket lockets can provide some fairness.
it is definitely not the aim to provide a general-purpose concurrent data
structure, which is fair and efficient, but to provide data structures that are,
as the name states, lock-free. ;)
the section about the performance of lock-free data structures already discusses
some performance-related aspects, but apparently this is not clear enough.
> fifo<> and ringbuffer::empty says it is not thread-safe. I would assume
> that means the value returned is potentially stale, but it can't break
> class invariants, or can it?
already improved this part.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk