Subject: Re: [boost] [atomic] comments
From: Helge Bahmann (hcb_at_[hidden])
Date: 2011-10-21 07:27:28


On Friday 21 October 2011 13:06:20 Tim Blechmann wrote:
> > > shared memory support:
> > > the fallback implementation relies on the spinlock pool that is also
> > > used by the smart pointers. however this pool is per-process, so the
> > > fallback implementation won't work in shared memory. can this be
> > > changed/fixed?
> >
> > fixing this would require a per-variable lock... depending on the
> > platform this can have enormous overheads.
> >
> > I would suggest using the compile-time macros BOOST_ATOMIC_*_LOCK_FREE to
> > pick an alternate code path.
>
> then we need some kind of interprocess-specific atomic ... maybe as part of
> boost.interprocess ... iac, maybe we should provide an implementation which
> somehow matches the behavior of c++11 compilers ...

well, if the atomics are truly atomic, then BOOST_ATOMIC_*_LOCK_FREE == 2, and
I find it difficult to imagine a platform where you cannot use them safely
between processes (not that such a platform could not exist)

if they are not atomic, then you are going to hit a "fallback-via-locking"
path, in which case you are almost certainly better off picking an
interprocess communication mechanism that just uses locking directly
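
A minimal sketch of that compile-time selection, assuming a simple counter as the use case. To stay self-contained it tests the C++11 standard macro ATOMIC_INT_LOCK_FREE; with Boost.Atomic the corresponding macro would be BOOST_ATOMIC_INT_LOCK_FREE. The `counter` type and its members are hypothetical names, and the std::mutex in the locking branch only stands in for a real process-shared lock.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// ATOMIC_INT_LOCK_FREE: 0 = never lock-free, 1 = sometimes, 2 = always.
#if ATOMIC_INT_LOCK_FREE == 2

// Lock-free path: the atomic is a plain hardware atomic.
struct counter {                       // hypothetical example type
    std::atomic<int> value{0};
    static constexpr bool uses_lock = false;

    int increment() {
        return value.fetch_add(1, std::memory_order_relaxed) + 1;
    }
};

#else

// Locking path: skip the atomic and use a lock directly.
// NOTE: std::mutex is per-process; code living in shared memory would
// need a process-shared mutex here instead (assumption of this sketch).
struct counter {
    std::mutex guard;
    int value = 0;
    static constexpr bool uses_lock = true;

    int increment() {
        std::lock_guard<std::mutex> lock(guard);
        return ++value;
    }
};

#endif
```

Both branches expose the same interface, so callers do not need to know which path was compiled in.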

> > > atomic::is_lock_free():
> > > is_lock_free is set to either `true' or `false'. however in some cases,
> > > there are alignment constraints (iirc, 64bit atomics on ia32/x86_64
> > > require a 64bit alignment). afaict there are no precautions to take
> > > care of this, are there?
> >
> > for x86_64 there is nothing to do, ABI requires 8 byte alignment already
> >
> > there used to be an __align__(8) to cover ia32, but it got lost... I
> > *think* the "lock" prefix will cover this case nevertheless (at a hefty
> > performance cost, though...)
>
> i see

but you certainly have a point that this alignment should be corrected; noted
as something to fix
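
As an illustration of the alignment requirement, here is a hedged sketch that uses the C++11 `alignas` specifier rather than the compiler-specific attribute mentioned above. The names `aligned_u64`, `holder` and `is_aligned_8` are hypothetical and are not taken from Boost.Atomic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Force 8-byte alignment even on ABIs (such as ia32) where a 64-bit
// integer inside a struct would otherwise only be 4-byte aligned.
struct aligned_u64 {                   // hypothetical wrapper
    alignas(8) std::uint64_t value;
};

static_assert(alignof(aligned_u64) == 8,
              "64-bit storage must be 8-byte aligned");

// Embedding the wrapper after a smaller member must still pad to 8 bytes.
struct holder {                        // hypothetical container
    char tag;
    aligned_u64 counter;
};

static_assert(offsetof(holder, counter) % 8 == 0,
              "member offset must preserve the 8-byte alignment");

// Run-time check for pointers whose origin is not known statically.
inline bool is_aligned_8(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 8 == 0;
}
```

The static assertions turn a silent misalignment into a compile error, which is cheaper than relying on the "lock" prefix to paper over it at run time.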

> > > compile-time vs run-time dispatching:
> > > some instructions are not available on every CPU of a specific
> > > architecture. e.g. cmpxchg8b or cmpxchg16b are not available on all
> > > ia32/x86_64 cpus. i would appreciate if these instructions would not be
> > > used before performing a CPUID check, whether these instructions are
> > > really available (at least in a legacy mode)
> >
> > the correct way to do that is to have different libraries for
> > sub-architectures and have the runtime linker decide... this requires
> > infrastructure not present in boost
>
> it would be equally correct to have something like:
> static bool has_cmpxchg16b = query_cpuid_for_cmpxchg16b();
>
> if (has_cmpxchg16b)
>     use_cmpxchg16b();
> else
>     use_fallback();
>
> less bloat and prbly only a minor performance hit ;)

problematic because the compiler must insert a lock to ensure thread-safe
initialization of the "static bool" (thus it is by definition not "lock-free"
any more)
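
For comparison, a sketch of one way to cache the CPUID result without a guarded function-local static: a namespace-scope std::atomic<int> is constant-initialized, so no initialization lock is involved. This is only an illustration, not something proposed in the thread. The names `probe_cmpxchg16b`, `g_cx16_state` and `has_cmpxchg16b` are hypothetical, and the probe assumes GCC or Clang on x86_64 (it reports "not available" everywhere else).

```cpp
#include <atomic>
#include <cassert>
#if defined(__GNUC__) && defined(__x86_64__)
#include <cpuid.h>
#endif

// Hypothetical probe: CPUID leaf 1, ECX bit 13 reports CMPXCHG16B.
inline bool probe_cmpxchg16b() {
#if defined(__GNUC__) && defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & (1u << 13)) != 0;
#else
    return false;                      // assumption: treat as unsupported
#endif
}

// 0 = not probed yet, 1 = absent, 2 = present.
// Constant-initialized, so the compiler emits no initialization guard.
std::atomic<int> g_cx16_state(0);

inline bool has_cmpxchg16b() {
    int state = g_cx16_state.load(std::memory_order_relaxed);
    if (state == 0) {
        // Racing threads may each run the probe, but they all compute
        // and store the same value, so the race is benign.
        state = probe_cmpxchg16b() ? 2 : 1;
        g_cx16_state.store(state, std::memory_order_relaxed);
    }
    return state == 2;
}
```

The cost is one relaxed load and a branch per call; whether that is acceptable on a hot path is a separate question.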

> > > cmpxchg16b:
> > > currently cmpxchg16b doesn't seem to be supported. this instruction is
> > > required for some lock-free data structures (e.g. there is a dequeue
> > > algorithm, that requires a pair of tagged pointers).
> >
> > could do, but cmpxchg16b is dog-slow, the fallback path is going to be
> > faster anyways
>
> on average, but not in the worst case. for real-time systems it is not
> acceptable that the os preempts a real-time thread while it is holding a
> spinlock.

prio-inheriting mutexes are usually much faster than cmpxchg16b -- use these
for hard real-time (changing the fallback path to use PI mutexes as well
might even be something to consider)

that being said, I can put it in, but I don't think there is value in it
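
For reference, a minimal sketch of setting up the kind of priority-inheriting mutex mentioned above, using the POSIX threads API. The helper name `init_pi_mutex` is hypothetical, and the sketch assumes a platform that supports PTHREAD_PRIO_INHERIT; where it does not, the error code is simply returned to the caller.

```cpp
#include <cassert>
#include <pthread.h>

// Initialise *m as a priority-inheriting mutex.
// Returns 0 on success, otherwise the pthread error code.
inline int init_pi_mutex(pthread_mutex_t* m) {   // hypothetical helper
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    // Ask for priority inheritance: a low-priority holder is boosted
    // while a higher-priority thread is blocked on the mutex.
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

A caller would check the return value and fall back to a plain mutex if the protocol is not supported.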

Best regards
Helge

