Subject: Re: [boost] [atomic] comments
From: Helge Bahmann (hcb_at_[hidden])
Date: 2011-10-31 16:38:50


On Monday 31 October 2011 19:29:35 Andrey Semashev wrote:
> > considering the cost of cmpxchg8b itself, the cost of a branch -- if done
> > correctly [1] -- is most likely immeasurable
>
> Probably. But I'm a perfectionist. :)

me too, but if it does not have a measurable detriment, I consider it
perfect :)

> > > Unfortunately, cmpxchg16b is not as common as cmpxchg8b, so a dynamic
> > > check would be desirable. However, I would prefer that there were no
> > > if's like the one above. Perhaps, a global table of pointers to the
> > > actual function implementations would be better. Initially pointers
> > > should point to functions that perform cpuid and initialize this table
> > > and then call the real functions for the detected hardware. This way we
> > > eliminate almost all overhead in the long run, including call_once.
> >
> > the processor most likely has more difficulties correctly predicting the
> > code flow through a register-indirect branch than a static one, so I am
> > not really sure this is cheaper, but it is in any case worth trying out
>
> Yes, this needs testing, however I hope that unconditional jump should be
> quite well predictable.

it's only predictable as long as its entry is in the BTB; as soon as that
gets flushed, you are out of luck

a conditional branch to a static address, on the other hand, is still
essentially free: put the rare path out of line at a forward address, and
even with cold caches it hits the default "predict not taken" assumption
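
a minimal sketch of that layout, assuming a GCC-style compiler for the hint
macros (they expand to nothing elsewhere, and they only hint at code placement,
they do not guarantee it). All names (sketch::cas, native_cas_flag,
cas_fallback) are hypothetical and not part of Boost.Atomic; a 64-bit
std::atomic stands in for the cmpxchg8b/cmpxchg16b fast path, and a real
implementation would set the flag once from cpuid:

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>

#if defined(__GNUC__)
#  define SKETCH_LIKELY(x) __builtin_expect(!!(x), 1)
#  define SKETCH_COLD __attribute__((noinline, cold))
#else
#  define SKETCH_LIKELY(x) (x)
#  define SKETCH_COLD
#endif

namespace sketch {

// Hypothetical feature flag. Assumption: a real library would fill this in
// once at startup from cpuid; here it is a plain variable so the sketch
// runs on any machine.
inline bool& native_cas_flag() {
    static bool flag = true;
    return flag;
}

// Single lock shared by the fallback path.
inline std::mutex& fallback_lock() {
    static std::mutex m;
    return m;
}

// Rare path: lock-based compare-and-swap, kept out of line so that the hot
// path stays a straight fall-through.
template <typename T>
SKETCH_COLD bool cas_fallback(std::atomic<T>& target, T& expected, T desired) {
    std::lock_guard<std::mutex> guard(fallback_lock());
    T current = target.load(std::memory_order_relaxed);
    if (current != expected) {
        expected = current;  // report the observed value, like a real CAS
        return false;
    }
    target.store(desired, std::memory_order_relaxed);
    return true;
}

// Hot path: one conditional branch on the flag, then the native operation.
template <typename T>
bool cas(std::atomic<T>& target, T& expected, T desired) {
    if (SKETCH_LIKELY(native_cas_flag())) {
        return target.compare_exchange_strong(expected, desired);
    }
    return cas_fallback(target, expected, desired);
}

}  // namespace sketch
```

since the flag never changes after startup, a given process only ever takes
one of the two paths, so the lock-based fallback and the native operation are
never mixed on the same object.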

> > also, this would not be a "single" function pointer but a whole bunch of
> > them to cover the different atomic operations (reducing everything to CAS
> > generates more lock/unlock cycles in the fallback path otherwise)
>
> Sure, like I said - a table of pointers.

since boost.atomic is supposed to stay a header-only library, there are
cases where the table will be instantiated multiple times -- the many
different pointers may put undue pressure on the BTB
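
for comparison, the self-patching pointer table under discussion might look
roughly like the sketch below. Everything in it is an assumption made for
illustration: the names (op_table, resolve_all, probe_native_support) are
hypothetical, the probe is a stub where a real one would execute cpuid, and
std::atomic operations stand in for the native instructions:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

namespace sketch {

typedef std::atomic<std::uint64_t> cell;
typedef std::uint64_t (*binary_op)(cell&, std::uint64_t);

// Resolver stubs: the initial table entries (defined further down).
inline std::uint64_t fetch_add_resolve(cell& c, std::uint64_t d);
inline std::uint64_t exchange_resolve(cell& c, std::uint64_t d);

// One pointer per operation, so the fallback does not have to reduce
// everything to CAS.
struct op_table {
    std::atomic<binary_op> fetch_add;
    std::atomic<binary_op> exchange;
    constexpr op_table()
        : fetch_add(&fetch_add_resolve), exchange(&exchange_resolve) {}
};

inline op_table& table() { static op_table t; return t; }
inline std::mutex& fallback_lock() { static std::mutex m; return m; }
inline std::atomic<int>& probe_count() { static std::atomic<int> n(0); return n; }

// Hypothetical CPU probe (assumption: stands in for a cpuid check). It
// reports "no native support" so that the locked fallback gets selected.
inline bool probe_native_support() { ++probe_count(); return false; }

// "Native" implementations.
inline std::uint64_t fetch_add_native(cell& c, std::uint64_t d) { return c.fetch_add(d); }
inline std::uint64_t exchange_native(cell& c, std::uint64_t d) { return c.exchange(d); }

// Lock-based fallbacks.
inline std::uint64_t fetch_add_locked(cell& c, std::uint64_t d) {
    std::lock_guard<std::mutex> guard(fallback_lock());
    std::uint64_t old = c.load(std::memory_order_relaxed);
    c.store(old + d, std::memory_order_relaxed);
    return old;
}
inline std::uint64_t exchange_locked(cell& c, std::uint64_t d) {
    std::lock_guard<std::mutex> guard(fallback_lock());
    std::uint64_t old = c.load(std::memory_order_relaxed);
    c.store(d, std::memory_order_relaxed);
    return old;
}

// Probe once and patch every entry. Racing threads store identical values,
// so no lock is needed here.
inline void resolve_all() {
    const bool native = probe_native_support();
    op_table& t = table();
    t.fetch_add.store(native ? &fetch_add_native : &fetch_add_locked);
    t.exchange.store(native ? &exchange_native : &exchange_locked);
}

inline std::uint64_t fetch_add_resolve(cell& c, std::uint64_t d) {
    resolve_all();
    return table().fetch_add.load()(c, d);
}
inline std::uint64_t exchange_resolve(cell& c, std::uint64_t d) {
    resolve_all();
    return table().exchange.load()(c, d);
}

// Front ends: every call is a register-indirect branch through the table.
inline std::uint64_t fetch_add(cell& c, std::uint64_t d) {
    binary_op impl = table().fetch_add.load(std::memory_order_relaxed);
    assert(impl != nullptr);
    return impl(c, d);
}
inline std::uint64_t exchange(cell& c, std::uint64_t d) {
    binary_op impl = table().exchange.load(std::memory_order_relaxed);
    assert(impl != nullptr);
    return impl(c, d);
}

}  // namespace sketch
```

after the first call the probe and the stubs are out of the picture, but each
operation still goes through its own indirect branch, and each of those needs
its own BTB entry to be predicted.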

Best regards
Helge

