Subject: Re: [boost] [lockfree] Review
From: Grund, Holger (Holger.Grund_at_[hidden])
Date: 2011-08-03 03:39:21


> > Efficient loads & stores are a bit tricky in that SSE2 is not a
> > requirement for 32-bit Windows. Without it, I think we need to resort
> > to FILD/FISTP, which is a pain.
>
> iirc, sse2 intrinsics are not guaranteed to be atomic, so sometimes
> memory access has to be emulated via CAS.
>
All aligned 64-bit accesses are guaranteed to be atomic on x86. The same is not true for 128-bit loads and stores on x64: there are no architectural guarantees, although I think most (if not all) Intel & AMD implementations as of 2009 still performed them atomically.

I'm not really sure how you would implement a fully correct lock-free atomic<int128_t> on x64. A cmpxchg16b requires the underlying page to be writable.

A very silly example would be:
const atomic<int128_t> x = 0; // read-only pages
void foo() { int128_t l = x; } // how does this load work? CMPXCHG16B would result in an access violation

In reality, 128-bit atomic reads from shared memory might be interesting. If you only have a read mapping I don't think there is any way to read 128 bits atomically.
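
To make the write requirement concrete, here is a minimal sketch of a load emulated through compare-and-swap. It is only an illustration: std::atomic<std::uint64_t> stands in for a 128-bit type, and load_via_cas is a hypothetical helper name, not something from Boost.Atomic or Boost.Lockfree.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: read an atomic object using only compare-and-swap.
// The parameter is deliberately a non-const reference: a CAS is a
// read-modify-write operation, so it cannot be applied to a const object
// (or, at the hardware level, to a read-only page).
template <typename T>
T load_via_cas(std::atomic<T>& a) {
    T expected = T();
    // If a == expected, the CAS stores the same value back (no visible change).
    // Otherwise the CAS fails and copies the current value into expected.
    a.compare_exchange_strong(expected, expected);
    return expected;
}
```

Either way, expected ends up holding the current value. Passing a const atomic to this helper does not compile, which is the source-level counterpart of the access violation described above.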

> > Is x aligned, here? I don't recall the ABI, but I believe it doesn't
> > guarantee anything beyond 4-byte alignment for ESP on entry. So to align x
> > properly in the stack frame, the stack must be dynamically aligned (or
> > some interprocedural optimization may help) -- but I don't think older
> > GCCs do that.
>
> dynamic memory allocation makes it even worse, because you can use
> placement new
> to put the data structure to virtually any memory location :/
>
Well, I would say if you do a placement new it's your responsibility to ensure proper alignment. If you don't, all bets are off -- or "undefined behavior" :-)

Most implementations have an operator new/malloc implementation that always returns suitably aligned memory. The tricky part about the x86 ABIs is that they generally only require the stack to be 4-byte aligned. Many implementations therefore fail to get the alignment of autos right.

E.g.:
uint64_t x0; // static storage duration, ok
uint64_t* x1 = new uint64_t; // dynamic storage duration, ok
char c[8];
uint64_t* x2 = new (c) uint64_t; // may not be aligned, but that's undefined behavior anyway
void foo() { uint64_t x3; } // may not be aligned

I think the last case is where compilers fail to implement the alignment requirements correctly.
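
A minimal sketch of how one might check these cases at run time. The helper name is_aligned is hypothetical, and the check assumes the alignment is a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is a multiple of alignment.
// Assumes alignment is a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

Calling is_aligned(&x3, 8) inside foo() would show whether a given compiler realigns the frame for the automatic variable. With C++11, declaring it as alignas(8) uint64_t x3; requests that alignment explicitly.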

Anyway, that's really about Boost.Atomic. Let's see if we can get some traction on it.

Thanks!
-hg
