From: William Kempf (sirwillard_at_[hidden])
Date: 2000-09-06 16:54:39

I've updated in the Threads folder with a newer
implementation. This implementation is quite interesting for several
reasons. If you're a Win32 programmer you'll be interested in the
implementation because it makes the semaphore and the mutex classes
operate much faster than the native Win32 equivalents and nearly as
fast as the Win32 critical section. (I've not done timings, but the
simple test harness illustrates this visibly!)

If you're not a Win32 programmer but are interested in the threading
library you may be interested in the implementation. There are now
only two classes that use native code: atomic_t and semaphore. The
atomic_t class is obviously preliminary, since we've hardly finished
discussing it, but it's needed for the implementation of the
semaphore. The semaphore has a slightly different interface than
that in Jeremy's Concept document, more closely resembling the Win32
semaphore concept, but this allows all the other synchronization
types to be built off of it. The semaphore uses a single Win32
native type: a Win32 event! A waiting thread first tries a spin lock
and blocks on the event only if it truly needs to block.
This is what makes it nearly as fast as a critical section. A native
synchronization type is needed here to ensure that we don't busy wait
unless we have to (the semaphore will be used for long term locking).
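
A minimal sketch of the spin-then-block idea described above, not the
posted implementation. It assumes standard C++ in place of Win32:
std::atomic stands in for atomic_t, and a std::mutex plus
std::condition_variable pair stands in for the Win32 event. The class
name, the spin limit and the helper function are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch: names and the spin limit are illustrative only.
class spin_semaphore {
public:
    explicit spin_semaphore(int initial) : count_(initial) {
        assert(initial >= 0);
    }

    void acquire() {
        const int spin_limit = 1000;  // assumed value, not tuned
        // Spin briefly, hoping a unit becomes free without blocking.
        for (int i = 0; i < spin_limit; ++i) {
            if (count_.load() > 0) break;
            std::this_thread::yield();
        }
        // Claim a unit. A previous value > 0 means we got one at once.
        if (count_.fetch_sub(1) > 0) return;
        // Otherwise the count is now negative and we are a registered
        // waiter: block until release() hands us a wakeup.
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return wakeups_ > 0; });
        --wakeups_;
    }

    void release() {
        // A negative previous value means some thread is waiting (or is
        // about to wait), so post exactly one wakeup for it.
        if (count_.fetch_add(1) < 0) {
            {
                std::lock_guard<std::mutex> lock(m_);
                ++wakeups_;
            }
            cv_.notify_one();
        }
    }

private:
    std::atomic<int> count_;       // > 0: free units, < 0: waiters
    std::mutex m_;                 // stand-in for the native event
    std::condition_variable cv_;
    int wakeups_ = 0;              // pending wakeups, guarded by m_
};

// Uses the semaphore (initial count 1) to protect a plain int.
int guarded_count(int threads, int per_thread) {
    spin_semaphore sem(1);
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                sem.acquire();
                ++counter;
                sem.release();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

The fast path touches only the atomic counter (read, decrement,
increment); the blocking object is used only when the counter shows
that a thread really has to wait.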

Because all other types are built off of atomic_t and semaphore it
should be easier to port to other platforms. At least that's the
theory. It also illustrates why I think we need the primitives.
They can be used to build higher level abstractions. Each level will
then be used to build yet higher level abstractions, etc.
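
As a rough illustration of that layering, here is a sketch in standard
C++. The semaphore below is only a stand-in for the platform-specific
layer (it is built on std::mutex and std::condition_variable purely so
the example is self-contained); sem_mutex and scoped_lock_guard are
hypothetical names showing how each level uses only the one beneath it.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Level 0: stand-in counting semaphore (an assumption for this sketch;
// in a port this would be the only class with native code).
class semaphore {
public:
    explicit semaphore(int initial) : count_(initial) {
        assert(initial >= 0);
    }
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Level 1: a mutex is a semaphore that starts with a single unit.
class sem_mutex {
public:
    sem_mutex() : sem_(1) {}
    void lock() { sem_.acquire(); }
    void unlock() { sem_.release(); }
private:
    semaphore sem_;
};

// Level 2: a scoped lock built only on the mutex interface.
class scoped_lock_guard {
public:
    explicit scoped_lock_guard(sem_mutex& m) : m_(m) { m_.lock(); }
    ~scoped_lock_guard() { m_.unlock(); }
    scoped_lock_guard(const scoped_lock_guard&) = delete;
    scoped_lock_guard& operator=(const scoped_lock_guard&) = delete;
private:
    sem_mutex& m_;
};

// Several threads increment a plain int under the scoped lock.
int layered_count(int threads, int per_thread) {
    sem_mutex mtx;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                scoped_lock_guard guard(mtx);
                ++counter;
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Only the bottom class would need rewriting for a new platform; the
mutex and the scoped lock are written purely against the level below.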

The atomic_t class is currently implemented off of the Win32
Interlocked* functions for ease of implementation (I'm not an
assembly programmer). However, for portability reasons, I think it
might be better to implement this in terms of the assembler
instructions, since this will port according to hardware instead of
OS. This is just a thought.

I've learned a bit about the Win32 atomic instructions as well. The
reason that Win95 supports such a small set of atomic functions is
because Win95 supports the 386. Several atomic operations were added
with the 486, so Win32 OSes that don't need to support the 386 have a
wider range of atomic operations. This does mean that we may need to
consider supporting only the limited set ourselves
(read/write/exchange/increment/decrement), since they are the most
likely to be available on differing architectures and cover most
necessary uses (the others can be thought of as optimizations).
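
A sketch of what such a limited interface might look like, assuming
standard C++ std::atomic in place of the Interlocked* functions. The
class name atomic_counter, its member names and the hammer helper are
hypothetical; this is not the atomic_t from the uploaded code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical wrapper exposing only the five operations listed above.
class atomic_counter {
public:
    explicit atomic_counter(long v = 0) : value_(v) {}

    long read() const { return value_.load(); }
    void write(long v) { value_.store(v); }
    // Stores v and returns the value that was there before.
    long exchange(long v) { return value_.exchange(v); }
    // Both return the new value after the update.
    long increment() { return value_.fetch_add(1) + 1; }
    long decrement() { return value_.fetch_sub(1) - 1; }

private:
    std::atomic<long> value_;
};

// Several threads increment one shared counter with no other locking.
long hammer(int threads, int per_thread) {
    atomic_counter c;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) c.increment();
        });
    for (auto& th : pool) th.join();
    return c.read();
}
```

Nothing here needs compare-and-swap, which is the kind of operation
that would fall outside the limited set.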

I also think I understand why I've always heard you should use the
atomic instructions to read/write. It has to do with multiple CPUs,
memory cache and the delay that would occur with a simple write. A
simple write may be atomic, but the Interlocked* functions ensure
speedy changes across CPU caches (at the expense of a memory lock,
but this is faster than the cache synching if I understand what I
read correctly). If anyone understands the x86 architecture maybe
they can better explain this to me (and anyone else who may be

Bill Kempf
