I know that on x86 and x64 architectures, assigning to or reading from any location up to the native general-purpose register size is atomic if the item in question is properly aligned for its size.

That is, { a=b; in parallel with b=c; } will not read a partially-changed b, if b is declared in the normal manner. Reading oddly-aligned things out of a packed stream, or using pragmas to change the alignment options, may upset this.
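To make the alignment caveat concrete, here is a minimal sketch, assuming a C++11 compiler that accepts #pragma pack (MSVC, GCC and Clang do). The struct names and the is_naturally_aligned helper are made up for illustration. It only shows how packing can move a 32-bit member off its natural boundary; it does not prove anything about atomicity by itself.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when the address is a multiple of the object's size.
template <typename T>
bool is_naturally_aligned(const T* p) {
    return reinterpret_cast<std::uintptr_t>(p) % sizeof(T) == 0;
}

// Declared "in the normal manner": the compiler pads after tag,
// so value lands on a 4-byte boundary on typical x86/x64 ABIs.
struct Normal {
    char tag;
    std::uint32_t value;
};

// Packed to 1-byte alignment: value now starts at offset 1,
// so it can never be naturally aligned relative to the struct start.
#pragma pack(push, 1)
struct Packed {
    char tag;
    std::uint32_t value;
};
#pragma pack(pop)
```

On the usual x86/x64 ABIs, offsetof(Normal, value) is a multiple of 4 while offsetof(Packed, value) is 1, which is the situation where the "properly aligned" condition stops holding.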

Furthermore, the cache is coherent among multiple CPUs or cores, even on a NUMA server. What you have to watch out for is when the compiler actually issues the read or write, since it can use a register and save it back to memory much later, or rearrange the requests. Also, the chip queues requests to memory with reads having priority, so a write followed by a read needs special consideration.
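As a rough illustration of pinning down when the read or write is issued, here is a sketch that assumes the C++11 <atomic> header, which is not part of the standard I refer to further down. The names payload, ready, publish and try_consume are hypothetical. The release store keeps the compiler from sinking the write to payload past the flag, and the acquire load keeps it from hoisting the read of payload above the flag check.

```cpp
#include <atomic>

int payload = 0;                 // ordinary, non-atomic data
std::atomic<bool> ready(false);  // flag that publishes the data

// Writer side: write the data first, then set the flag.
// The release ordering forbids moving the payload write after the flag store.
void publish(int value) {
    payload = value;
    ready.store(true, std::memory_order_release);
}

// Reader side: check the flag, then read the data.
// The acquire ordering forbids reading payload before the flag load.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;  // nothing published yet
    }
    out = payload;
    return true;
}
```

The write-followed-by-read case is different: when one thread stores to one location and then loads from another, release/acquire is not enough, and the default sequentially consistent ordering (or an explicit fence) is the usual choice.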

Assigning to a (non-volatile) char might do something “interesting”. For example, if two separate char variables are declared, the compiler might keep them in registers and save them both out at the end with a single 16-bit write. The x86/x64 instruction set is conducive to that, but not to other cases. In general, though, an architecture might indeed merge separate variables into a single larger register. The compiler might then re-save something that didn’t change, thus clobbering a change made on another thread.

The current C++ standard does not address threads, so there is indeed no portable way to guarantee that. You have to encapsulate and implement for each architecture, and use compiler-specific extensions. It would be interesting to see a list of architectures noting whether or not primitive type reads and writes are atomic, if only to verify that they all are, should the list turn up none where they aren’t. I’m sure every entry would carry footnotes, like the ones I described above for the one I’m familiar with.
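By "encapsulate" I mean something along these lines. This is only a sketch under stated assumptions: x86/x64 targets, an int that is naturally aligned, and either MSVC (_ReadWriteBarrier and _mm_mfence from <intrin.h>) or GCC/Clang (__sync_synchronize). The names full_fence and shared_int are hypothetical, and a real implementation would need auditing per compiler and per architecture.

```cpp
// Pick the compiler-specific full barrier; anything else is unsupported here.
#if defined(_MSC_VER)
  #include <intrin.h>
  inline void full_fence() {
      _ReadWriteBarrier();  // stop the compiler from reordering across this point
      _mm_mfence();         // stop the CPU from reordering across this point (x86/x64)
  }
#elif defined(__GNUC__)
  inline void full_fence() {
      __sync_synchronize(); // GCC/Clang builtin: full compiler and hardware barrier
  }
#else
  #error "No fence implementation for this compiler in this sketch"
#endif

// Hypothetical wrapper hiding the per-platform details behind load/store.
class shared_int {
public:
    explicit shared_int(int v = 0) : value_(v) {}

    int load() const {
        int v = value_;  // volatile forces an actual memory read
        full_fence();    // later accesses stay after this read
        return v;
    }

    void store(int v) {
        full_fence();    // earlier accesses complete before this write
        value_ = v;      // volatile forces an actual memory write
        full_fence();    // the write is visible before any later read
    }

private:
    volatile int value_; // assumed naturally aligned, so single accesses are atomic on x86/x64
};
```

The fences here are deliberately heavier than strictly needed; the point is only that the platform-specific parts sit in one place and callers see load() and store().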

--John

From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Kevin Kassil
Sent: Friday, September 04, 2009 9:32 AM
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] shared_ptr and weak_ptr concurrency

Stefan,

On Thu, Sep 3, 2009 at 2:30 AM, Stefan Strasser <strasser@uni-bremen.de> wrote:

...

why would you even need a lock here?
the shared_ptr doc says that you can expect the same thread safety from
shared_ptr as you can from built-in types.
you can use multiple-readers-single-writer without any locks on built-in
types.

You can? Is assigning to a char or a double guaranteed to be atomic? How can the compiler guarantee that? -- What if there is some architecture for which it's not a single instruction assignment?

Kevi