From: Corrado Zoccolo (czoccolo_at_[hidden])
Date: 2007-09-12 15:13:00


On 9/9/07, Kim Barrett <kab_at_[hidden]> wrote:
> At 11:11 AM +0200 9/9/07, Corrado Zoccolo wrote:
[...]
> > Do you see any drawback in changing the access to the counter to a
> > simple volatile access, at least when the platform is known to be an
> > IA32?
>
> Don't do that. It won't work properly on a multi-processor
> system. Memory barriers are needed to ensure correct operation on such
> systems, and gcc (x86) does not generate a memory barrier for a
> volatile load.

The question is what guarantee you can have when reading the value of a
shared counter that other threads can modify asynchronously.
In my use case, the copy-on-write behaviour of a smart-pointer-like
class, the guarantee is that if the ref_count reads as 1, then only one
thread owns the smart ptr.
This is the only guarantee I need to implement COW correctly.
It holds because:
* when the smart_ptr is created, the reference count is set to 1.
* if you want to pass it to another thread, you will place it in a
synchronized queue or some other mechanism that provides the necessary
memory barriers.
* if another thread is copying it from the initial thread, and the
initial thread can do a modifying operation that triggers COW, then
the accesses must be locked, or you will end up in undefined behaviour.
* if a third thread is copying it from a thread that has a copy but
is not the initial one (the one on which we suppose the COW is
operating), then the count was already >1.

Note: the converse (if only one thread has it, then the count will be
seen as 1) does not always hold, but this doesn't affect correctness:
a stale count >1 only causes an unnecessary copy.
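
The argument above can be sketched in code. This is only a minimal
illustration, not Boost code: CowString, Rep, unique() and write() are
hypothetical names, and std::atomic (a later C++ standard facility)
stands in for the atomic counter under discussion. The uniqueness check
uses an acquire load, which on IA32 compiles to an ordinary load with no
fence instruction, so it costs the same as the volatile read proposed
above while remaining correct on other platforms.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical copy-on-write handle; all names are illustrative only.
class CowString {
    struct Rep {
        std::atomic<long> refs{1};   // the count is 1 when the object is created
        std::string data;
        explicit Rep(std::string s) : data(std::move(s)) {}
    };
    Rep* rep_;

    void release() {
        assert(rep_ != nullptr);
        // Atomic read-modify-write (a locked instruction on IA32).
        if (rep_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete rep_;             // we held the last reference
    }

public:
    explicit CowString(std::string s) : rep_(new Rep(std::move(s))) {}

    // Copying is an atomic increment of the shared count.
    CowString(const CowString& other) : rep_(other.rep_) {
        rep_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    CowString& operator=(const CowString&) = delete;  // omitted for brevity

    ~CowString() { release(); }

    // A count of 1 implies that this handle is the only owner.
    // Acquire load: a plain MOV on IA32, no barrier instruction.
    bool unique() const {
        return rep_->refs.load(std::memory_order_acquire) == 1;
    }

    const std::string& read() const { return rep_->data; }

    // Detach from the shared representation before modifying it.
    std::string& write() {
        if (!unique()) {
            Rep* fresh = new Rep(rep_->data);  // private copy, count starts at 1
            release();                         // drop our share of the old one
            rep_ = fresh;
        }
        return rep_->data;
    }
};
```

If unique() reads a stale value greater than 1, write() only makes a
copy that was not strictly needed; it never modifies shared data.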

There are probably other legitimate uses of the atomic counter that
require a memory barrier (or only an acquire), but that is not the
most general case.

[...]
>
> Because the (current) standard does not address threads and such at
> all, different implementations have associated different semantics
> with "volatile" in the presence of threads. I expect that *on
> solaris* one would find a memory barrier generated for this code
> sequence.
>
volatile is specified to always result in a memory operation (i.e. the
compiler is not allowed to cache the value in a register).
The outcome then depends on the processor's memory-ordering semantics.
Intel has just documented the memory model for both x86 and ia64
(thanks to gpd for this link):
http://developer.intel.com/products/processor/manuals/318147.pdf

Reading it, it seems to me that if a memory location is always updated
with locked operations (which act as memory barriers), then no acquire
is needed when reading that value to ensure that you see all the stores
that the updating thread made before the update.
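
A small sketch of that pattern, under stated assumptions: payload,
counter, publish() and try_consume() are hypothetical names, and
std::atomic is used to express the intent portably. On x86 the
fetch_add compiles to a locked instruction, while the acquire load
compiles to an ordinary load, which is the "no extra barrier on the
read side" situation described above.

```cpp
#include <atomic>

// Hypothetical names, for illustration only.
int payload = 0;                // ordinary, non-atomic data
std::atomic<int> counter{0};    // only ever updated with locked operations

// Updating thread: store the data, then bump the counter.
void publish(int value) {
    payload = value;                                  // plain store
    counter.fetch_add(1, std::memory_order_seq_cst);  // locked RMW, full barrier on x86
}

// Reading thread: if the update is visible, so are the stores before it.
bool try_consume(int& out) {
    // Acquire load: a plain MOV on x86, no fence instruction is emitted.
    if (counter.load(std::memory_order_acquire) == 0)
        return false;           // update not seen yet
    out = payload;              // ordered after the counter load
    return true;
}
```

In real use publish() and try_consume() would run in different threads,
with the reader retrying until try_consume() returns true.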

Corrado

-- 
__________________________________________________________________________
dott. Corrado Zoccolo                          mailto:zoccolo_at_[hidden]
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
                               Tales of Power - C. Castaneda
