From: Anthony Williams (anthony_w.geo_at_[hidden])
Date: 2006-12-01 08:46:16
Roland Schwarz <roland.schwarz_at_[hidden]> writes:
> Anthony Williams wrote:
>> Roland Schwarz <roland.schwarz_at_[hidden]> writes:
>>> In particular in
>>> presence of multiple processors. I.e. an atomic lib is primarily about
>> Not just about performance. It also enables the construction of the
>> higher-level primitives.
> As you might know, this is the route I am following. But the primitives
> are not necessarily exposed to the user. To be more precise: From a user
> perspective an atomic lib is primarily about performance. Better?
Maybe. I'm not sure.
>> I think that the memory barrier and acquire/release semantics are just two
>> ways of talking about the same thing.
> This is a point I am still confused about. acquire/release are
> "one-way" ordering constraints while memory barriers are "two-way".
acquire/release gets me in a twist. Re-reading my refs, you're right.
> This is as I understand it:
> *) memory barriers are primitives which have no effect other than to
> order memory accesses, i.e. they do not store or load anything by themselves.
> Also they affect only memory operations and have no effect on others.
> Also they should affect the compiler, as to disallow reordering across
> the barrier.
> There are 3: read_mb, write_mb and full_mb.
> read_mb orders read access, i.e. no read from before may be moved after
> the barrier and no read from after may be moved before the barrier.
> write_mb does the same for writes. full_mb disallows moving any read or
> write across the barrier, and so establishes total order.
> *) acquire/release semantics on the other hand establish a
> conceptually different model of ordering. acquire disallows moving any
> access (read/write) from after the primitive to occur before it, but
> still allows accesses from before to occur after it. This is kind of a
> one-way sign for accesses.
> release semantics is the other way round.
> Also acquire/release is bound to an operation, as a kind of attribute of
> the operation, while memory barriers are operations in their own right.
Yes. Acquire semantics tend to apply only to read operations, and release
semantics only to write operations.
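
As an illustration of ordering attached to an operation, here is a minimal
sketch using the std::atomic facilities that were standardized later, in
C++11. They did not exist when this thread was written, so their use here is
an assumption, and the names payload, ready, producer, consumer and
run_message_passing are hypothetical. The release store keeps the earlier
write to payload from moving after it, and the acquire load keeps the later
read of payload from moving before it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag whose store/load carry the ordering

void producer() {
    payload = 42;                                  // ordinary write
    // Release: the write above may not be reordered after this store.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire: the read of payload below may not be reordered before this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // the release/acquire pair makes the write visible here
}

// Hypothetical driver: runs one producer and one consumer thread.
int run_message_passing() {
    int seen = 0;
    std::thread writer(producer);
    std::thread reader([&seen] { seen = consumer(); });
    writer.join();
    reader.join();
    return seen;
}
```

Calling run_message_passing() once should return the value written by the
producer. Note that the ordering is a property of the store and the load
themselves, not of a separate barrier instruction.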
> So as I currently understand it, these two concepts are about the same
> issue, but are neither orthogonal nor can one be used to synthesize the
> other. I would be glad to be proved wrong.
You're right. I was getting confused.
> Another observation: release/acquire semantics is closer to mutex
> behavior, since no harm is done when an operation from before mutex
> acquisition is moved inside the critical section. A barrier would not
> allow such an operation. True?
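
The mutex analogy above can be sketched as a spinlock whose lock operation
has acquire semantics and whose unlock operation has release semantics. This
uses C++11's std::atomic_flag, which postdates this thread, and the names
SpinLock and count_with_spinlock are hypothetical. It is a sketch of the
idea, not a recommendation for production locking.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SpinLock {
    std::atomic_flag locked_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Acquire: accesses inside the critical section may not move
        // before this point; accesses from before it may still sink in.
        while (locked_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: accesses inside the critical section may not move
        // after this point; later accesses may still be hoisted in.
        locked_.clear(std::memory_order_release);
    }
};

// Hypothetical driver: several threads increment a plain counter under the lock.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Operations may migrate into the critical section from either side, but
nothing inside it may leak out, which is all a mutex needs.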
>> As I understand it, on x86, the SFENCE instruction is a "Store Fence", which
>> is a "Write Barrier", and has "Release Semantics". Any store instructions
>> which happen before it on this CPU are made globally visible afterwards. No
>> store instructions which occur afterwards on this CPU are permitted to be
>> globally visible beforehand.
> It looks to me as if it is possible in the acquire/release model to
> separate the operation from the "fence". Or, viewed from the other
> direction, it is possible to optimize by attaching the (otherwise
> separate) fence to an instruction to save some CPU cycles. True?
This SFENCE instruction is just a store fence. Some other (non-fence)
instructions also have fence-like properties, but they tend to be full
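
For comparison, the idea of separating the fence from the operation can be
sketched with the standalone fences that C++11 later provided. This is an
assumption layered on the discussion, not something either poster proposed,
and the names data_fenced, flag_fenced, fenced_writer, fenced_reader and
run_fenced are hypothetical. The flag accesses themselves are relaxed; the
ordering comes entirely from the separate fence calls.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int data_fenced = 0;                   // plain, non-atomic data
std::atomic<bool> flag_fenced{false};  // accessed with relaxed ordering only

void fenced_writer() {
    data_fenced = 7;
    // Standalone release fence: orders the write above before the store below.
    std::atomic_thread_fence(std::memory_order_release);
    flag_fenced.store(true, std::memory_order_relaxed);
}

int fenced_reader() {
    while (!flag_fenced.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Standalone acquire fence: orders the load above before the read below.
    std::atomic_thread_fence(std::memory_order_acquire);
    return data_fenced;
}

// Hypothetical driver: runs one writer and one reader thread.
int run_fenced() {
    int seen = 0;
    std::thread writer(fenced_writer);
    std::thread reader([&seen] { seen = fenced_reader(); });
    writer.join();
    reader.join();
    return seen;
}
```

How such fences map onto SFENCE, LFENCE or other instructions is up to the
compiler and target; the sketch makes no claim about the generated code.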
>> Again on x86, the LFENCE instruction is a "Load Fence", which is a "Read
>> Barrier", and has "Acquire Semantics". Any read instructions which happen
>> before it on this CPU must have already completed afterwards.
> Are you really sure about this one?
Yes, apart from the "acquire semantics". The Intel spec says:
"Performs a serializing operation on all load-from-memory instructions
that were issued prior the LFENCE instruction. This serializing operation
guarantees that every load instruction that precedes in program order the
LFENCE instruction is globally visible before any load instruction that
follows the LFENCE instruction is globally visible."
>> No load
>> instructions which occur afterwards on this CPU are permitted to be executed
> This part of the statement makes sense to me.
> I omitted the rest of your post, since I think it depends on the
> acquire/release versus memory barriers getting clarified first.
>> The details of the memory model, atomics, and visibility, and how it applies
>> to C++, are under discussion amongst C++ standards committee members. I would
>> imagine that you'd be welcome to join such discussions.
> Hmm, not sure how I could join other than posting to some lists. Do you
> mean comp.lang.c++.moderated?
No. There is a cpp-threads mailing list, and the C++ Standards committee
Peter Dimov told me how to get on cpp-threads.
Ask your national body (Here in the UK it's BSI, in Germany it's DIN, and in
the USA it's ANSI) about joining their C++ panel, and speak to Andy Koenig
about getting added to the committee reflectors. If your national body is
unhelpful, speak to Lois Goldthwaite (standards_at_[hidden]), and she'll probably
let you join the BSI panel.
--
Anthony Williams
Software Developer
Just Software Solutions Ltd
http://www.justsoftwaresolutions.co.uk