
Subject: [boost] [atomic] [review] Atomic Review
From: Stewart, Robert (Robert.Stewart_at_[hidden])
Date: 2011-10-24 15:00:25

Boost.Atomic should be included in Boost.

- What is your evaluation of the potential usefulness of the library?

Providing C++03 support for the new atomics functionality will prove immensely useful. It would be even more useful if it could be provided as a portability layer for environments with both C++03 and C++11 compilers. That is, if one could use boost::atomic to get Helge's implementation on C++03 compilers and std::atomic on conforming C++11 compilers, without needing to change code, this library would be ideal. That might require changes to the current implementation, but seems unlikely to break code using the current release.

- What is your evaluation of the design?

The design is largely fixed since it is standardized.

I was put off initially that the names appear in namespace boost rather than, say, boost::atomics, but since this is intended to provide *the* Boost implementation of C++11 functionality, I don't expect there to be any competitors. That said, s/boost::atomic/std::atomic/ is no simpler a change than s/boost::atomics::atomic/std::atomic/ to make in the future.

- What is your evaluation of the implementation?

I'll leave most comments on this to those with more experience. However, I'm positive that any issues will be addressed quickly as more eyes are drawn to the implementation, so even if there are issues at present, they should not preclude acceptance.

I think the fallbacks, in lines 22-77 of atomic.hpp, would be better in a separate file such as boost/atomic/fallbacks.hpp or even just at the end of boost/atomic/platform.hpp. It would remove clutter from atomic.hpp.

I would expect a different conditional compilation structure in platform.hpp would be easier to follow and maintain:

   // Compiling with GCC
   #if defined __GNUC__
   # if defined __i386__ || defined __x86_64
   # include <boost/atomic/detail/gcc-x86.hpp>
   // etc.
   # else
   # include <boost/atomic/detail/gcc-cas.hpp>
   # endif

   // How about ARM on Linux?
   #elif defined __linux__ && defined __arm__
   # include <boost/atomic/detail/linux-arm.hpp>

   // Use Windows Interlocked* APIs?
   #elif defined BOOST_USE_WINDOWS_H || defined _WIN32_CE...
   # include <boost/atomic/detail/interlocked.hpp>


The directory layout is a little odd. I'd have expected boost/atomic/platform.hpp to be boost/atomic/detail/platform.hpp. That, of course, leaves nothing in boost/atomic. However, if namespace atomics is added, then boost/atomic.hpp, if retained, can just include boost/atomics/atomic.hpp, the latter containing the content presently in boost/atomic.hpp.

- What is your evaluation of the documentation?

As usual, I have numerous detailed comments which I've addressed at the end.

The documentation should include some links to suitable background material for those not "already...familiar with concurrency in general, as well as elementary concepts such as 'mutual exclusion'".

Information on which section(s) of the C++11 standard are and aren't covered by the library would be beneficial.

Generally, the explanations assume too much familiarity with the material with no pointers to resources to bridge the gap. I've given one, specific suggestion in my detailed comments below on what I'd like to see in the Usage examples section. It may be that more thorough descriptions are the province of books and that it is fair for Boost.Atomic to assume a good deal -- as does the C++11 standard. Nevertheless, I'd prefer to see more in the documentation.

I was hoping that Boost.Atomic would provide a bridge to C++11 by engaging platform-supplied atomic support when available so that I wouldn't have to do that work. However, since that is not available, the documentation should make that clear.

- Did you try to use the library? With what compiler? On which platform?
  Did you have any problems?

I wish I could say yes, but the team I was hoping to get to use the library opted for a safer, more expedient solution to their present issues than adopting a new library. I haven't had an independent opportunity.

- How much effort did you put into your evaluation? A glance? A quick
  reading? In-depth study?

I spent a lot of time reviewing the documentation and a little looking over the code.

- Are you knowledgeable about the problem domain?

Sadly, I'm not nearly as knowledgeable as I should be. I'm working on that, and Helge has actually been very helpful in answering specific issues I've had with others' code, so I'm comfortable trusting his knowledge.

Detailed Documentation Comments


Presenting Boost.Atomic

   s/these data types/them/


The following change avoids the non-word "sequentialized" and should prove clearer:

   s/each operation behaves as if it were strictly sequentialized/the operation is executed as if no other thread were attempting to do the same/

The bulleted list would read better as follows. I eliminated "custom coordination protocols" because that confuses rather than helps.

   "Atomic variables are useful as:

    - a means to coordinate multiple threads
    - faster alternatives to lock-based access to simple data types"

Thread coordination using Boost.Atomic

   s/accesses of threads to shared variables/threads accessing shared variables/
   s/the cache hierarchies /cache hierarchies/
   s/above one/above/

Presumably, a "fully formalized definition" of "happens-before" is in the C++11 standard, so the reader should be referred to the appropriate section thereof.

The structure of this page seems broken. I'd expect the following:

   Coordination using happens-before
      Using mutual exclusion
      Using release and acquire
      Using fences
   Coordination of dependent data access
   Coordination using sequential consistency

Enforcing happens-before through mutual exclusion

   s/If this is be thread1/If thread1 succeeds first/
   s/Obviously cannot state/There is no way to determine/
   s/either one/either option/

happens-before through release and acquire

   s/the value this value/the value/
   s/the same atomic/the *same* atomic/
   s/avenues for/orders of/
   s/hold: thread2/hold because thread2/

The presentation of the possible outcomes is not as clear as it could be. I suggest something like the following:

"There are two possible orders in which the atomic operations can occur:

"1. thread1 stores a new value in a before thread2 loads a's value. Since a has been incremented to 1, thread2 will execute B. All of the listed criteria are satisfied, so 'A _happens-before_ B' holds.

"2. thread2 loads a's value before thread1 stores a new value. Given that a is initially zero, thread2 will execute C. Since thread1 did not first store a new value in a with release semantics, the criteria listed above are not satisfied, so 'A _happens-before_ C' does *not* hold."


The phrasing in the first paragraph is a bit less clear than it could be. Try the following:

"... in conjunction with preceding (for acquire or consume), succeeding (for release), or both (for seq_cst) atomic operations."

The use of memory_order_relaxed, in the example, should be called out.

   s/in the case C is executed/when C is executed/

happens-before through release and consume

   s/uses release and consume/uses release and consume semantics/
   s/coordination: If/coordination. If/
   s/the value this value/the value/
   s/the same atomic/the *same* atomic/
   s/value of the atomic variable/value of that atomic variable/
   s/avenues for/orders of/
   s/hold: thread2/hold because thread2/
   s/accesses (presumable writes) to/accesses of (presumably writes to)/
   s/accesses (presumably reads) to/accesses of (presumably reads from)/
   s/by thread2: Lacking/by thread2. Lacking/
   s/Note that in this example, the fact that/Note that/

The following would be clearer for the explanation:

"There are two possible orders in which the atomic operations can occur:

"1. thread1 stores a new value in a before thread2 loads a's value. In this case, since a has been set to 1, thread2 will use 1 as the value of index to execute B. All of the listed criteria are satisfied, so 'A _happens-before_ B' holds.

"2. thread2 loads a's value before thread1 stores a new value. In this case, thread2 will execute B with index set to 0, which is not the value (yet to be) set by thread1. Since thread1 did not first store a new value in a with release semantics, the criteria listed above are not satisfied, so 'A _happens-before_ B' does *not* hold."

Your statement that the second example in this section is erroneous lacks an explanation. Presumably, the problem stems from the conditional branch, but an explanation is needed.

Sequential consistency

   s/coordination: If/coordination. If/

This section deserves example code and an explanation:

   ...; /* A */
   ...; /* B */

   ...; /* C */
   ...; /* D */

"There are two possible orders of execution of that code:

"1. thread1 assigns to a before thread2 assigns to b. A _happens-before_ D. Whether A _happens-before_ C is not known.

"2. thread2 assigns to b before thread1 assigns to a. C _happens-before_ B. Whether C _happens-before_ A is not known.

"The fences could have been operations on the same or different atomic variables. All that matters is that memory_order_seq_cst be the memory order in each case."

Programming interfaces

This section would be better formatted as in, for example,

Memory order

The table formatting is confusing, at least without any borders between the cells. Add internal borders or align the constants to the top of their cell. Using a table may be faulty anyway. It would probably be simpler to define each in a separate paragraph.

   s/informally speaking/informally/

memory_order_relaxed: The description is hard to read. This should be better: "Informally, operations following the atomic operation may be ordered before it and operations preceding it may be ordered after it. This constraint is only suitable in two cases: when further operations do not depend upon the outcome of the atomic operation or ordering is enforced separately by atomic_thread_fence operations." (Notice that the final period is missing in your version.)

memory_order_release: Rephrasing would add a bit of clarity: "Informally, all memory operations preceding the atomic operation may be reordered after, but not before, it."

memory_order_acquire: s/succeeding/following/ (for consistency). Rephrasing would be helpful: "Informally, all memory operations following the atomic operation may be reordered before, but not after, it."

memory_order_consume: s/restrictive/limited/. Rephrasing: "Similar to memory_order_acquire, except that only memory operations following the atomic operation that are _computationally-dependent_ (this should link to a definition!) on the value retrieved from the atomic variable, are order restricted."

memory_order_acq_rel: s/operation/operations./

memory_order_seq_cst: s/additional enforces/additionally enforces a/, s/such qualified/so qualified/

   s/section happens-before for explanation/happens-before for explanations/

Atomic objects

   s/template class/class template/

boost::atomic<T> template class

   s/objects supports/objects support/
   s/new value to atomic variable/new value/

The descriptions are weak, since they don't refer to the order argument(s). Describing the three- versus four-argument compare_exchange function differences in a subsequent paragraph is less helpful. A footnote reference, with a link, to that explanation would be fine, however. Also note that the success/failure ordering behavior deserves more explanation. Why is it useful or important? Who should care?

Even if you think the assignment and conversion operators should be avoided, they should be documented. What's more, I disagree with your rationale. Those operators imply memory_order_seq_cst. If that's what a user wants -- and that's usually the case -- why not use the syntactic sugar? You can state your case as, "When specifying memory ordering, other than memory_order_seq_cst, for some atomic operations, specify them for all, rather than using the assignment and conversion operators, to make all of the operations clear."

No mention of the default for the order argument.

boost::atomic<integral> template class

The descriptions are inconsistent with those of the previous section. For example, a consistent description for fetch_add() would be, "Add v to current value, returning current value."

Bitwise operations are typically capitalized. Thus, "bitwise AND".

   s/always has memory_order_seq_cst as default parameter/is defaulted to memory_order_seq_cst in each case/

Same argument regarding the operators as for the previous section.

boost::atomic<pointer> template class

   s/all atomic object/boost::atomic<integral> objects/

The descriptions are inconsistent with those of the initial section.

   s/always has memory_order_seq_cst as default parameter/is defaulted to memory_order_seq_cst in each case/

Same argument regarding the operators as for the previous section.

Feature testing macros

   s/detection whether/detection of whether/
   s/operations, or whether/operations or whether/
   s/signed\/unsigned/unsigned/ (except for BOOST_ATOMIC_CHAR_LOCK_FREE)

Usage examples

There should be an introductory paragraph for this section. Try the following:

"This section shows non-trivial examples of using Boost.Atomic to implement common mechanisms using atomics. These might be considered advanced tutorials or, perhaps, reference implementations."

Reference counting

You should reference some external source to discuss the idea behind and uses of reference counting, such as

   s/reference counter/reference count/

You should clarify what you mean by, "passing an existing reference from one thread to another must already provide any required synchronization." I presume you mean that the old thread must not release its reference count on the object, so it won't be released, before the new thread can increment the reference count.

   s/must obviously happened/obviously must have happened/

You don't explain why the acquire fence is needed. I started to write why the release semantics were needed for the decrement, but found I couldn't figure out how to say it, which means I'm not entirely sure I understand it either. Every time I think I get the details of these things, I find that I'm somehow not quite clear. Thus, your explanation needs to be clearer. However, after thinking things over at some length, I think I have it and will give the explanation a try anyway:

"If a thread is trying to increment the reference count, there's no contention. The increment is atomic and doesn't need to impose ordering on intrusive_ptr_release(), because there must be a valid, existing reference, thereby ensuring the reference count will be at least one, regardless of the order between intrusive_ptr_release() and intrusive_ptr_add_ref() calls. That's why intrusive_ptr_add_ref() uses memory_order_relaxed.

"If a thread is trying to decrement the reference count, it might be decrementing the value to something greater than zero. In that case, all that's important is that the decrement be atomic to prevent races between threads. However, if the reference count is decremented to zero, then no other thread can have a valid, outstanding reference. That implies that there can be no race in comparing the value returned by fetch_sub() (which returns the original value) with 1. If fetch_sub() returns 1 (meaning the reference count is now zero), then, and only then, must x be deleted. Thus, a memory_order_acquire fence is used to ensure that the delete expression is not reordered ahead of the fetch_sub() call."

That sort of analysis, in which each of the various execution orders and use cases are examined and discussed, is necessary to grasp this stuff. Perhaps you'd rather that be left to book authors, but I don't think it's too much to ask in this context.


Spinlock

You should reference external information on spinlocks to guide readers to their appropriate uses, particularly in light of your warning to understand the consequences of their use.

   s/"happens before"/_happens-before_/

This example deserves a good explanation of the busy-wait loop and why acquire and release semantics are used in each case.

Last paragraph: punctuation and wording problems. I suggest this instead:

"This 'optimization' is actually useless and may be harmful. Since threads can be spinning busily on a locked spinlock, it doesn't matter whether CPU cycles are wasted with 'exchange' and 'acquire' operations or with just 'exchange' operations. Furthermore, a tight 'exchange' loop, without any memory-synchronizing instruction introduced by an 'acquire' operation will, on some systems, monopolize the memory subsystem which degrades the performance of other activities."

Singleton with double-checked locking pattern

This section should refer to external references on the Singleton Pattern and the Double-checked Locking Pattern, particularly with details on the problems with the DCLP.

1st para, 2nd sentence: change to read, "instance must ensure that creating the instance in one thread _happens-before_ any dereferences of the pointer in other threads." That avoids referring to "happens after" which you've not discussed in the documentation.

The explanation of the implementation should be more detailed.

Wait-free ring buffer

   s/one single/one/ or s/one single/a single/
   s/without any locks/without locks/
   s/"wait-free" which/"wait-free", which/

The code requires a great deal more explanation. Each ordering constraint should be discussed. Places where others have made mistakes might be worth mentioning as "anti-examples".

Avoid the use of "happens after" since you don't discuss it anywhere.

Wait-free ring buffer

   s/"happens before"/_happens-before_/

As with the previous example, each ordering constraint in this code should be discussed, and places where others have made mistakes might be worth mentioning as "anti-examples".


   s/Advise/Advice/ or s/Advise/Recommendation/

In the second bullet, you assume that the reader understands "computation-dependency" and "control-dependency" but I doubt that fits within the expected knowledge claimed in the Introduction.


Unit tests

"For example" and, consequently (to my mind, anyway) "e.g." should be set off by a comma. (This probably applies elsewhere.)

   s/behave atomic as appropriate/behave atomically as expected/

Rob Stewart robert.stewart_at_[hidden]
Software Engineer using std::disclaimer;
Dev Tools & Components
Susquehanna International Group, LLP


