From: Giovanni Piero Deretta (gpderetta_at_[hidden])
Date: 2008-04-18 06:51:41


On Fri, Apr 18, 2008 at 3:09 AM, Cory Nelson <phrosty_at_[hidden]> wrote:
> On Thu, Apr 17, 2008 at 5:24 PM, Patrick Twohig <p-twohig_at_[hidden]> wrote:
> > Theoretically, the CAS should atomically compare and swap the value in one
> > clock cycle. However, with multiple cores/processors/hyper threading,
> > multiple instructions are being executed simultaneously over arbitrary
> > numbers of clock cycles, so there can be writes pending while you want to
> > read from memory. As a result, when you go to read something another process
> > has written to, you may read stale data. To combat this, you enforce a
> > memory barrier, which guarantees that all pending memory transactions before
> > the barrier have completed before moving on with the program. Additionally,
> > some architectures (like x86) allow for unaligned access of memory. When an
> > unaligned value is accessed, it sets an exception then it replaces the
> > single read/write operation with multiple bus operations which wreaks havoc
> > on any compare/swap operations.
>
> They don't happen in a single cycle; I don't think there is anything
> specifying that they should. Barriers aren't needed on x86 or x64,
> other than compile-time only ones to make sure the compiler doesn't
> reorder something.
>

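Well, you actually do need StoreLoad memory barriers on x86: a later
load can be satisfied before an earlier store to a different location
becomes visible to other processors. All other barriers (LoadLoad,
LoadStore, StoreStore) are always implicit (unless you use non-temporal
SSE loads/stores).

StoreLoad is also implicit if you use locked operations (for example a
lock-prefixed cmpxchg); otherwise you need an explicit mfence.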

See, for example, http://g.oswego.edu/dl/jmm/cookbook.html
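
To make the StoreLoad case concrete, here is a minimal sketch of the classic
"store buffering" test. It is an illustration only: it assumes C++11
std::atomic and std::thread (which postdate this message), and the helper name
count_storeload_violations is hypothetical. Each thread stores to its own
variable and then loads the other one. With sequentially consistent
operations, which compilers on x86 typically implement with a locked
instruction or a plain store followed by mfence, the outcome where both
threads read 0 should never be observed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the store-buffering test `iterations` times and
// returns how many runs ended with BOTH threads reading 0, i.e. how often
// each thread's load appeared to pass its own earlier store.
int count_storeload_violations(int iterations) {
    assert(iterations >= 0);
    int violations = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1;
        int r2 = -1;

        // Thread 1: store x, then load y.
        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        // Thread 2: store y, then load x.
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });
        t1.join();
        t2.join();

        // Sequential consistency forbids r1 == 0 && r2 == 0.
        if (r1 == 0 && r2 == 0) {
            ++violations;
        }
    }
    return violations;
}
```

Calling count_storeload_violations(2000) is expected to return 0. If the
stores and loads are weakened to std::memory_order_relaxed, the both-zero
outcome becomes permitted, which is the reordering a StoreLoad barrier is
there to prevent.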

-- 
gpd
