From: Bill Wade (bill.wade_at_[hidden])
Date: 2000-06-09 08:34:43


It is my understanding that, at least on modern Intel, SMP does not make
aligned 32-bit reads/writes any less atomic. That atomicity can cost
efficiency, though: access to word N on processor 1 may slow access to word
N+1 on processor 2, because the cache line holding both words gets flushed
from processor 2's cache.

In "Hoard: A Scalable Memory Allocator for Multithreaded Applications"
(sorry, no URL right now, but I did get it from online) the authors make the
point that heap blocks allocated by different processors should not be in
the same cache line because of performance hits related to cache flushing
(not by the allocator, but by code using the allocated blocks) even when no
explicit synchronization calls are made. They call this "false sharing."
The implication is that (at least on many SMP systems) a certain amount of
synchronization is implicit and unavoidable.
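
To make the false-sharing point concrete, here is a minimal layout sketch. It
is not code from the Hoard paper. The 64-byte line size and the names
kAssumedCacheLine, PackedCounters, PaddedCounters and same_cache_line are all
assumptions made up for illustration, and alignas requires C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Common on x86, but not guaranteed.
constexpr std::size_t kAssumedCacheLine = 64;

// Two words side by side. The struct is aligned to a line boundary so that
// both fields are certain to land in the same (assumed) cache line.
struct alignas(kAssumedCacheLine) PackedCounters {
    std::uint32_t a = 0;  // imagine this written only by processor 1
    std::uint32_t b = 0;  // imagine this written only by processor 2
};

// Same two words, but each one starts its own (assumed) cache line.
struct PaddedCounters {
    alignas(kAssumedCacheLine) std::uint32_t a = 0;
    alignas(kAssumedCacheLine) std::uint32_t b = 0;
};

static_assert(sizeof(PaddedCounters) >= 2 * kAssumedCacheLine,
              "each padded field should occupy its own line");

// Hypothetical helper: true if both addresses fall in the same assumed line.
inline bool same_cache_line(const void* p, const void* q) {
    assert(p != nullptr && q != nullptr);
    const auto a = reinterpret_cast<std::uintptr_t>(p);
    const auto b = reinterpret_cast<std::uintptr_t>(q);
    return a / kAssumedCacheLine == b / kAssumedCacheLine;
}
```

With a PackedCounters object, same_cache_line(&c.a, &c.b) is true, so two
processors that each write only their own field still contend for one line.
With PaddedCounters it is false, at the cost of extra memory per object.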

Anyway, if my understanding is correct, no particular compile-time options
seem to be really necessary for Intel SMP. You couldn't get "fast but not
atomic" reads/writes (32-bit aligned) on Intel SMP, even if you wanted them.
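
As a rough sketch of what this looks like in code, the fragment below uses
std::atomic from C++11, which did not exist when this was written. The names
shared_word, publish and observe are hypothetical. With relaxed ordering the
code asks only for atomicity of the individual read or write, which is the
property discussed above. It makes no claim about the ordering of surrounding
memory operations, which is a separate question.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// A naturally aligned 32-bit word shared between threads.
std::atomic<std::uint32_t> shared_word{0};

// Write the whole word atomically. Relaxed ordering requests atomicity only.
void publish(std::uint32_t value) {
    // Assumption being checked: the platform needs no lock for this type.
    assert(shared_word.is_lock_free());
    shared_word.store(value, std::memory_order_relaxed);
}

// Read the whole word atomically; a reader never sees a half-written value.
std::uint32_t observe() {
    return shared_word.load(std::memory_order_relaxed);
}
```

On x86 compilers typically turn both functions into ordinary move
instructions, which matches the claim that there is no cheaper "non-atomic"
aligned 32-bit access to opt into.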

You certainly might want to know about the availability of MP for
non-synchronization reasons (use an MP implementation of find() or copy())
but the need is less obvious (to me at least) for low-level synchronization
primitives.

Any Intel (or other) SMP gurus out there who can correct me?

