From: Matt Hurd (matt.hurd_at_[hidden])
Date: 2005-04-27 01:34:35
On 4/27/05, Felipe Magno de Almeida <felipe.m.almeida_at_[hidden]> wrote:
> I don't know if this thread is the best place to ask this question, but...
> Where do I find the proposals about threads in C++? Libs, core
> language modifications, memory model, etc.? I've also seen
> somewhere that there's a group discussing those things,
> but I only had access to it through a web page... maybe there's
> some way for others to receive those mails, even without permission to
> reply? I'm really interested in it, but I fear things like
> atomicity guarantees for volatile and static variables and some other
> too-strong guarantees...
The latest thinking seems to be not to change anything that would
incur overhead as per the latest mailing:
This is in keeping with the seemingly fundamental C++ principle of not
paying for what you don't use.
The consensus within our group (grudgingly for some of us) is that it would
be preferable to leave the semantics of data races undefined, mostly because
it is much more consistent with the current C++ specification, practice, and
especially implementations. In the absence of objections from the committee,
we plan to follow that path.
Though I find much of the paper interesting, a useful portable
implementation of it is such a long way off that boost should cover much
of this territory via libs.
For some architectures such a library approach may not be possible.
For example, I'm not sure you can do a load_store barrier by library
on ia64 when I look at the JSR133 docs. This seems to be the only
example of a popular architecture where a library approach may not be
possible.
In the meantime, some things such as statics in functions should perhaps
just be declared as concurrently unsafe and avoided.
I think boost should do a few things:
1) memory model
Acknowledge there is none. Therefore must assume a data race. Have a
practical portable memory barrier available:
Have a load_load, load_store, store_store, store_load primitive set of
functions for memory barriers modelled after the JSR-133 Cookbook.
This should also introduce platform labels for architectures in boost
in addition to the current compiler / OS #defines.
I think we need to assume that lock/unlock synchronisation of a
non-null mutex is a full barrier. Is that correct? Perhaps having
such primitives queriable by trait is enough...
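
As a rough illustration of the four-barrier primitive set described above, here is a minimal sketch. It assumes a C++11 compiler and maps each barrier conservatively onto std::atomic_thread_fence, which did not exist when this was written; the namespace `barrier`, the function names, and the `publish`/`try_consume` helpers are hypothetical, not an existing Boost API. A real per-architecture version would pick cheaper instructions where the hardware allows.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical barrier set named after the JSR-133 categories.
// Assumption: each is mapped to a C++11 fence that is at least as
// strong as the named barrier, not to the cheapest instruction.
namespace barrier {
inline void load_load()   { std::atomic_thread_fence(std::memory_order_acquire); }
inline void load_store()  { std::atomic_thread_fence(std::memory_order_acquire); }
inline void store_store() { std::atomic_thread_fence(std::memory_order_release); }
inline void store_load()  { std::atomic_thread_fence(std::memory_order_seq_cst); }
}  // namespace barrier

// Illustrative shared state: a plain payload guarded by a flag.
int payload = 0;
std::atomic<bool> ready{false};

// Writer: make the payload visible before the flag.
void publish(int value) {
    payload = value;
    barrier::store_store();  // payload store ordered before flag store
    ready.store(true, std::memory_order_relaxed);
}

// Reader: only read the payload after seeing the flag.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_relaxed)) return false;
    barrier::load_load();    // flag load ordered before payload load
    out = payload;
    return true;
}
```

The flag itself is a std::atomic only so that the example is well defined under the later C++11 rules; the ordering work is done by the barrier calls.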
2) atomic read writes
There should be a type trait that reports whether a type is
sufficiently small for atomic reading or writing to memory for the
given architecture being compiled for.
This type trait may be used for generic or macro methods of
introducing or, perhaps more importantly, avoiding synchronisation.
I use such a technique in some of my code successfully. Especially
useful for helping with synch-safe property style interfaces.
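
A minimal sketch of such a trait and one way to use it for a property-style interface follows. Everything here is hypothetical: the names `is_atomically_accessible`, `shared_property` and `wide_value`, and above all the assumption that any trivially copyable, power-of-two-sized object no wider than a pointer can be read or written in one access. That assumption would need checking per architecture. The small-type path uses std::atomic with relaxed ordering (C++11) rather than a bare variable so that the example stays well defined.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <type_traits>

// Assumption, not a measured fact: pointer width is the widest access
// the target performs as a single load or store.
constexpr std::size_t assumed_atomic_width = sizeof(void*);

// Hypothetical trait: true when T is small enough for one access.
template <typename T>
struct is_atomically_accessible
    : std::integral_constant<bool,
          std::is_trivially_copyable<T>::value &&
          sizeof(T) <= assumed_atomic_width &&
          (sizeof(T) & (sizeof(T) - 1)) == 0> {};

// Property holder that avoids the mutex when the trait allows it.
template <typename T, bool Small = is_atomically_accessible<T>::value>
class shared_property;

template <typename T>
class shared_property<T, true> {
    std::atomic<T> value_{T()};
public:
    static constexpr bool uses_lock = false;
    T get() const { return value_.load(std::memory_order_relaxed); }
    void set(T v) { value_.store(v, std::memory_order_relaxed); }
};

template <typename T>
class shared_property<T, false> {
    mutable std::mutex mutex_;
    T value_{};
public:
    static constexpr bool uses_lock = true;
    T get() const { std::lock_guard<std::mutex> g(mutex_); return value_; }
    void set(const T& v) { std::lock_guard<std::mutex> g(mutex_); value_ = v; }
};

// Illustrative type that is too wide for the unlocked path.
struct wide_value { double a, b, c, d; };
```

With this, `shared_property<int>` compiles to the unlocked variant and `shared_property<wide_value>` to the mutex-protected one, without the calling code changing.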
3) assuring real memory
Not just a promotion to register. This is required so that when a
memory barrier is invoked it is acting on the necessary parts of the
code we expect it to and the variable we are interested in sharing is
actually shareable and not aggressively optimized into a cpu specific
Is volatile a practical way to do this (perhaps in co-operation with
compiler optimization settings)? Can we assume "volatile" assures
this and the var will not be a register only optimisation? Is there a
better way? I think this is the only guarantee we need from volatile.
Atomicity of reads and writes might be nice for volatile, but this is
out of scope of a library.
4) further synch primitives
A synch lib that provides the usual suspects of atomic ops such as
inc, dec, add, sub, cas. These are normally defined for an
architecture on a certain width of bit field. Thus a trait mechanism
should indicate native support for a type.
Perhaps a trait mechanism for what type of memory barrier equivalent
guarantees have been provided by the operation.
Perhaps generic implementations that, at worst, use a full mutex, for
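
To make the atomic-op and native-support ideas concrete, here is a small sketch using the C++11 std::atomic facilities, which postdate this post. `atomic_fetch_max` and `has_native_atomics` are hypothetical helper names; the first shows how an operation the hardware lacks can be built from cas, the second uses `is_lock_free()` as a stand-in for the "native support" trait.

```cpp
#include <atomic>
#include <cassert>

// Generic "store the maximum" built from a CAS loop.
// Returns the value observed before any update.
template <typename T>
T atomic_fetch_max(std::atomic<T>& target, T candidate) {
    T seen = target.load(std::memory_order_relaxed);
    while (seen < candidate &&
           !target.compare_exchange_weak(seen, candidate,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        // On failure, 'seen' now holds the current value; retry.
    }
    return seen;
}

// Stand-in for a native-support trait: reports whether std::atomic<T>
// is implemented without an internal lock on this platform.
template <typename T>
bool has_native_atomics() {
    std::atomic<T> probe{T()};
    return probe.is_lock_free();
}
```

The inc, dec, add and sub operations map directly onto `fetch_add` and `fetch_sub` on the same std::atomic types.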
5) better generic synchronization api for mutex operations
Should be able to use null_synch, simple_synch and shared_synch ( or
rw_synch ) primitives in a policy like manner so that we can write
concurrent aware constructs to a shared/exclusive model and have that
work for no concurrency, exclusive concurrency and shared/exclusive
concurrency by policy. ACE has had something similar for over ten
years. I use a simple type translation layer over boost::thread to
achieve a similar outcome.
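
One possible shape for such a policy layer is sketched below. It is not the translation layer mentioned above, whose details are not given; it assumes C++17 standard library types (std::mutex, std::shared_mutex) in place of boost::thread, and the nested names `mutex_type`, `shared_lock`, `exclusive_lock` and the `counter` class are made up for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

// No concurrency: locks are empty objects and compile away.
struct null_synch {
    struct mutex_type {};
    struct shared_lock    { explicit shared_lock(mutex_type&) {} };
    struct exclusive_lock { explicit exclusive_lock(mutex_type&) {} };
};

// Exclusive concurrency: readers and writers both take the mutex.
struct simple_synch {
    using mutex_type     = std::mutex;
    using shared_lock    = std::lock_guard<std::mutex>;
    using exclusive_lock = std::lock_guard<std::mutex>;
};

// Shared/exclusive concurrency: many readers or one writer.
struct shared_synch {
    using mutex_type     = std::shared_mutex;
    using shared_lock    = std::shared_lock<std::shared_mutex>;
    using exclusive_lock = std::unique_lock<std::shared_mutex>;
};

// A construct written once against the shared/exclusive model.
template <typename Synch>
class counter {
    mutable typename Synch::mutex_type mutex_;
    long value_ = 0;
public:
    void increment() {
        typename Synch::exclusive_lock guard(mutex_);
        ++value_;
    }
    long get() const {
        typename Synch::shared_lock guard(mutex_);
        return value_;
    }
};
```

The same `counter` source then serves as `counter<null_synch>` in single-threaded code, `counter<simple_synch>` under a plain mutex, and `counter<shared_synch>` where concurrent readers are worth the extra cost.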
6) threading api
Then we can worry more about an appropriate threading api and the
religion of cancelling threads, OS specific operations, propagation of
exceptions, and the like.
7) cost model
It would be nice to include traits with approximate relative costs for
architectures for some operations as these can be vastly different.
Many ops on many architectures might be zero cost. For example the
stronger memory model of x86 gives you a NOP, zero cost, for
load_store and store_store barriers. This will become increasingly
important in developing, so called, lock free methods. I call them
"so called" as they usually require memory model guarantees through
barriers or atomic ops, such as CAS, and these can be very expensive
on some architectures. Lock free algorithms are often more
complicated and if the non-locking concurrency costs are sufficiently
high the benefits can quickly dissipate. For example, locking
operations on an intel P4 are expensive. On an Athlon and ia64 they
are considerably cheaper by more than a factor of two. This, and
varying barrier, synch primitive costs, changes the appropriateness of
an algorithm for an architecture but I can imagine fairly simple
compile time formulae for determining 80% of class / algorithm
selection problems transparently to a user.
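
A compile-time formula of this kind could look something like the sketch below. The architecture tags, the `op_cost` trait, and in particular the cost numbers are placeholders invented for the example; they are not measurements of any real processor. The rule shown (use the lock-free variant only if its CAS operations together cost less than one lock/unlock pair) is just one simple candidate formula.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical architecture tags.
struct cheap_cas_arch {};
struct costly_cas_arch {};

// Hypothetical cost trait in arbitrary relative units.
// The values are placeholders, NOT benchmark results.
template <typename Arch> struct op_cost;

template <> struct op_cost<cheap_cas_arch> {
    static constexpr int cas = 10;
    static constexpr int lock_unlock = 40;
};
template <> struct op_cost<costly_cas_arch> {
    static constexpr int cas = 30;
    static constexpr int lock_unlock = 50;
};

// Two interchangeable implementations of some algorithm.
struct locking_algorithm {};
struct lock_free_algorithm {};

// Pick lock-free only when its CAS operations are cheaper in total
// than a single lock/unlock pair on the target architecture.
template <typename Arch, int CasPerOperation>
using select_algorithm = typename std::conditional<
    (CasPerOperation * op_cost<Arch>::cas < op_cost<Arch>::lock_unlock),
    lock_free_algorithm,
    locking_algorithm>::type;

static_assert(std::is_same<select_algorithm<cheap_cas_arch, 2>,
                           lock_free_algorithm>::value,
              "2 * 10 < 40, so lock-free is selected");
static_assert(std::is_same<select_algorithm<costly_cas_arch, 2>,
                           locking_algorithm>::value,
              "2 * 30 >= 50, so locking is selected");
```

The user of such a library would only name the algorithm; the selection happens from the traits for the architecture being compiled for.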