Subject: Re: [boost] [lock-free] CDS -yet another lock-free library
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2010-03-29 11:22:37


Khiszinsky, Maxim wrote:
>> Have you considered using the proposed Boost.Atomic? This should
>> support more platforms.
>>
>> If there is something missing from Boost.Atomic that is needed for this
>> purpose, it would be useful to know about it.
>
> Yes, I considered Boost.Atomic some time ago and I decided to
> implement separate atomics in the CDS library:
>
> 1. Not all of the processor architectures that I need are implemented in
> Boost.Atomic (though this may no longer be true today).

You listed x86, amd64, ia64, sparc; Boost.Atomic currently supports (I
think) x86, amd64, alpha, ppc and arm. Of course ia64 and sparc
implementations for Boost.Atomic would be useful.

> 2. Function-based implementation of atomics produces non-optimal
> code in some cases. Consider the usual implementation of atomic with
> explicit memory ordering:
> static inline void store( atomic_t * pDest, atomic_t nVal, memory_order order )
> {
> switch ( order ) {
> case memory_order_relaxed: *pDest = nVal; break ;
> case ...
> case ...
> }
> }
> The problem is that the compiler (in some cases) generates case-based
> code even when the 'order' argument is a constant at the call site:
> store( &myAtomic, 10, memory_order_relaxed) ;
> In this case, instead of ONE assembler store instruction, the compiler
> may generate many branch instructions. That is not optimal :-(. And 99%
> of code using atomic primitives passes a *constant* memory_order argument.

I would like to think that all modern compilers could get this right,
at least if the right level of optimisation were enabled. Can you
please tell us in what case you have observed this?
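
To make the question concrete, here is a minimal sketch of the function-style form. It is written against the proposed C++0x std::atomic interface rather than CDS's atomic_t, and the wrapper store() and the helper store_relaxed_demo() are made up for illustration; neither is code from CDS or Boost.Atomic. Since the wrapper is inline and the caller passes a constant, an optimising compiler is free to fold the switch down to the one matching case. Whether a particular compiler actually does so is what would need checking, for example by reading the generated assembler (g++ -O2 -S).

```cpp
#include <atomic>

// Hypothetical function-style wrapper in the shape of the quoted snippet:
// the memory order arrives as an ordinary run-time argument.
inline void store(std::atomic<int>* dest, int value, std::memory_order order)
{
    switch (order) {
    case std::memory_order_relaxed:
        dest->store(value, std::memory_order_relaxed);
        break;
    case std::memory_order_release:
        dest->store(value, std::memory_order_release);
        break;
    default:
        // Any other order falls back to the strongest ordering.
        dest->store(value, std::memory_order_seq_cst);
        break;
    }
}

// The caller passes a compile-time constant, as in the quoted example.
inline int store_relaxed_demo()
{
    std::atomic<int> a(0);
    store(&a, 10, std::memory_order_relaxed);
    return a.load(std::memory_order_relaxed);
}
```

Compiling store_relaxed_demo() with and without optimisation and comparing the output would show whether the branches survive.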

> A more optimized implementation of atomics is template-based:
>
> template <memory_order ORDER>
> void store( atomic_t * pDest, atomic_t nVal ) ;
> template <>
> void store<memory_order_relaxed>( atomic_t * pDest, atomic_t nVal )
> {
> *pDest = nVal ;
> }
> ...
> And so on for each memory_order constant.
> The template-based implementation is more optimized: the memory_order
> constant is a *compile-time selector* that chooses the appropriate
> implementation. No compiler optimization is required.

If it's true that compilers get this wrong, then an approach like your
suggestion should be considered. However, it's not just Boost.Atomic
that would need to be re-done but also the spec for the proposed C++0x
features, which would be more difficult (!).
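
For comparison, the template-selected form described in the quoted text might look roughly like this. Again it is only a sketch using std::atomic<int> in place of CDS's atomic_t; the names store<>() and store_template_demo() are illustrative assumptions, not the CDS or Boost.Atomic interface.

```cpp
#include <atomic>

// The memory order is a template argument, so the implementation is
// chosen at compile time. The primary template is declared but not
// defined: an order with no specialization below fails at link time.
template <std::memory_order Order>
void store(std::atomic<int>* dest, int value);

template <>
inline void store<std::memory_order_relaxed>(std::atomic<int>* dest, int value)
{
    dest->store(value, std::memory_order_relaxed);
}

template <>
inline void store<std::memory_order_release>(std::atomic<int>* dest, int value)
{
    dest->store(value, std::memory_order_release);
}

template <>
inline void store<std::memory_order_seq_cst>(std::atomic<int>* dest, int value)
{
    dest->store(value, std::memory_order_seq_cst);
}

// The order is now part of the function name rather than an argument.
inline int store_template_demo()
{
    std::atomic<int> a(0);
    store<std::memory_order_relaxed>(&a, 10);
    return a.load(std::memory_order_relaxed);
}
```

The trade-off is that the order can no longer be chosen at run time, which is why this form does not match the function-argument interface in the proposed C++0x spec.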

Hopefully Helge will notice this thread soon...

Phil.

