From: Alexander Terekhov (terekhov_at_[hidden])
Date: 2004-06-09 05:13:11
scott wrote:
[...]
> The actual bug that I created was related to the mutex
> controlling instantiation. Essentially if there is a
> multi-access issue around instantiation, how does adding
> a mutex make it better, i.e. how does the mutex get
> constructed (safely)?
See boost.thread once() implementation for windows. I call it "lazy
mutex". Note that the use of interlocked there is a bit braindamaged
(but it's hard to do better given that present compilers don't
understand more generic memory barriers [hoist and sink stuff***]).
Here's some "portable" double-checked init in dynamic [not static]
context. Just an illustration.
< double-checked serialized init >
DCSI-TLS:
class stuff : private lazy_mutex { // "create/open named mutex"
                                   // trick on windows
  const lazy * m_ptr;
  thread_specific_ptr<lazy, no_cleanup> m_tsp;
public:
  /* ... */
  const lazy & lazy_instance() {
    const lazy * ptr;
    if (!(ptr = m_tsp.get())) {
      lazy_mutex::guard guard(this);
      if (!m_ptr) m_ptr = new lazy();
      m_tsp.set(ptr = m_ptr);
    }
    return *ptr;
  }
};
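For readers on a C++11 (or later) compiler, the DCSI-TLS idea can be sketched with standard facilities. This is only an illustration under assumptions: std::mutex stands in for lazy_mutex, a thread_local map keyed by object stands in for thread_specific_ptr, and the `lazy` payload with its `value` member is hypothetical. The sketch also assumes the `stuff` object outlives every thread that calls into it, since the per-thread cache entries are never erased.

```cpp
#include <cassert>
#include <mutex>
#include <unordered_map>

// Hypothetical payload type standing in for `lazy` in the post.
struct lazy {
    int value = 42;
};

class stuff {
    std::mutex m_mutex;           // stands in for lazy_mutex
    const lazy* m_ptr = nullptr;  // shared pointer, read and written only under m_mutex
public:
    ~stuff() { delete m_ptr; }

    const lazy& lazy_instance() {
        // Per-thread cache keyed by object; stands in for thread_specific_ptr.
        thread_local std::unordered_map<const stuff*, const lazy*> cache;
        const lazy*& slot = cache[this];
        if (!slot) {
            // First call on this thread: take the lock, create the shared
            // object if nobody has yet, then remember it for this thread.
            std::lock_guard<std::mutex> guard(m_mutex);
            if (!m_ptr) m_ptr = new lazy();
            slot = m_ptr;
        }
        assert(slot != nullptr);
        return *slot;
    }
};
```

Each thread pays for the lock once; after that the fast path touches only thread-local state, so no memory barriers are needed on it.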
DCSI-MBR:
class stuff : private lazy_mutex { // "create/open named mutex"
                                   // trick on windows
  atomic<const lazy *> m_ptr;
public:
  /* ... */
  const lazy & lazy_instance() {
    const lazy * ptr;
    if (!(ptr = m_ptr.load(msync::hlb))) {
      lazy_mutex::guard guard(this);
      if (!(ptr = m_ptr.load(msync::none)))
        m_ptr.store(ptr = new lazy(), msync::ssb);
    }
    return *ptr;
  }
};
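A rough C++11 rendering of DCSI-MBR, for comparison. The mapping is an assumption, not an exact equivalence: memory_order_acquire is used where the post has msync::hlb, memory_order_relaxed for msync::none, and memory_order_release for msync::ssb. The standard orderings are somewhat stronger than the hoist-load and sink-store barriers described here. std::mutex replaces lazy_mutex and the `lazy` payload is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical payload type standing in for `lazy` in the post.
struct lazy {
    int value = 42;
};

class stuff {
    std::mutex m_mutex;                        // stands in for lazy_mutex
    std::atomic<const lazy*> m_ptr{nullptr};
public:
    ~stuff() { delete m_ptr.load(std::memory_order_relaxed); }

    const lazy& lazy_instance() {
        // First check, outside the lock (approximates msync::hlb).
        const lazy* ptr = m_ptr.load(std::memory_order_acquire);
        if (!ptr) {
            std::lock_guard<std::mutex> guard(m_mutex);
            // Second check, under the lock; the mutex already orders
            // this load (approximates msync::none).
            ptr = m_ptr.load(std::memory_order_relaxed);
            if (!ptr) {
                ptr = new lazy();
                // Publish the fully constructed object (approximates msync::ssb).
                m_ptr.store(ptr, std::memory_order_release);
            }
        }
        assert(ptr != nullptr);
        return *ptr;
    }
};
```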
DCCI: (lockless double-checked concurrent init)
class stuff {
  atomic<const lazy *> m_ptr;
public:
  /* ... */
  const lazy & lazy_instance() {
    const lazy * ptr;
    if (!(ptr = m_ptr.load(msync::hlb)) &&
        !m_ptr.attempt_update(0, ptr = new lazy(), msync::ssb)) {
      delete ptr;
      ptr = m_ptr.load(msync::hlb);
    }
    return *ptr;
  }
};
"hlb" stands for "hoist load barrier" and "ssb" stands for "sink
store barrier".
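The lockless DCCI variant can be approximated in C++11 with compare_exchange_strong playing the role of attempt_update. Again a sketch under assumptions: acquire stands in for "hlb", acq_rel on the successful exchange stands in for "ssb" (stronger than strictly needed), and the `lazy` payload is hypothetical. Unlike the code above, the losing thread does not reload m_ptr, because a failed compare_exchange_strong already writes the current value into `expected`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical payload type standing in for `lazy` in the post.
struct lazy {
    int value = 42;
};

class stuff {
    std::atomic<const lazy*> m_ptr{nullptr};
public:
    ~stuff() { delete m_ptr.load(std::memory_order_relaxed); }

    const lazy& lazy_instance() {
        const lazy* ptr = m_ptr.load(std::memory_order_acquire);
        if (!ptr) {
            // Several threads may get here and each construct a candidate.
            const lazy* fresh = new lazy();
            const lazy* expected = nullptr;
            if (m_ptr.compare_exchange_strong(expected, fresh,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire)) {
                ptr = fresh;      // this thread won the race
            } else {
                delete fresh;     // another thread won; discard our candidate
                ptr = expected;   // holds the winner's pointer after the failure
            }
        }
        assert(ptr != nullptr);
        return *ptr;
    }
};
```

The price of avoiding the lock is that `lazy` may be constructed more than once under contention, so this only fits when constructing and discarding an extra instance is harmless.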
>
> In the notes this is the following line on page 8.
Slides aren't bad up to Pg. 22. "Multiprocessors, Cache Coherency,
and Memory Barriers" part is somewhat screwed. For example, Pg. 34
and Pg. 35:
  Keyboard* temp = pInstance;
  Perform acquire;
  ...
(that notation sucks, BTW) is not really the same (with respect to
reordering) as
  Keyboard* temp = pInstance;
  Lock L1(args); // acquire
  ...
because the latter can be transformed to
  Lock L1(args); // acquire
  Keyboard* temp = pInstance;
  ...
While it does stress the point of Pg. 36, the difference is quite
significant and it can really hurt you in some other context. Beware.
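In C++11 terms the two cases might be sketched as follows. This is an interpretation, not the slides' code: "Perform acquire" is read here as a standalone acquire fence, and std::mutex stands in for the Lock class. The names load_then_fence, load_then_lock, and lock_mutex are made up for the illustration.

```cpp
#include <atomic>
#include <mutex>

struct Keyboard {
    int id = 7;  // hypothetical member
};

std::atomic<Keyboard*> pInstance{nullptr};
std::mutex lock_mutex;

// Load followed by a standalone acquire fence: the fence keeps later
// memory accesses from being performed before the load.
Keyboard* load_then_fence() {
    Keyboard* temp = pInstance.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    return temp;
}

// Load followed by a lock acquisition: the acquire belongs to the lock
// operation, so it only stops later accesses from moving above the lock.
// Nothing stops the earlier relaxed load from sinking below it.
Keyboard* load_then_lock() {
    Keyboard* temp = pInstance.load(std::memory_order_relaxed);
    std::lock_guard<std::mutex> lock(lock_mutex);
    return temp;
}
```

Both functions return the same value when run in isolation; the difference shows up only in which reorderings the compiler and hardware are allowed to apply around the load.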
regards,
alexander.
P.S. < synchronized static locals aside for a moment >
typedef aligned_storage< once_call< void > > pthread_once_t;
#define PTHREAD_ONCE_INIT ... magic ...
extern "C" int pthread_once(pthread_once_t * once_control,
                            void (* init_routine)()) {
  once_control->object()(init_routine);
  return 0;
}
extern "C++" int pthread_once(pthread_once_t * once_control,
                              void (* init_routine)()) {
  once_control->object()(init_routine);
  return 0;
}
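With C++11, a wrapper of the same shape can be sketched on top of std::call_once. This is not the once_call<> template referred to above, which is not shown in this post; once_control_t, run_once, and the counting init routine are hypothetical names chosen so as not to clash with the real pthread_once.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical control block; std::once_flag plays the role of once_call<void>.
struct once_control_t {
    std::once_flag flag;
};

// Same shape as pthread_once: runs init_routine at most once per control block.
int run_once(once_control_t* once_control, void (*init_routine)()) {
    assert(once_control != nullptr && init_routine != nullptr);
    std::call_once(once_control->flag, init_routine);
    return 0;
}

// Small init routine used to observe how many times initialization ran.
int init_count = 0;
void count_init() { ++init_count; }
```

Calling run_once twice on the same control block with count_init leaves init_count at 1.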
***) < barriers (data dependent stuff aside for a moment) >
Full fence means "noop.acquire+release". Acquire operation/access
prevents subsequent (in program order) memory accesses from moving
"up in time" to before the acquire operation; IOW, it prevents
hoisting above acquire operation. Release operation/access prevents
prior (in program order) memory accesses from moving "down in time"
to after the release operation; IOW, it prevents sinking below
release operation. op.acquire is op with "hoist-load+hoist-store"
constraints; op.release is op with "sink-load+sink-store"
constraints; a StoreLoad fence is a noop with "sink-store+hoist-load"
constraints; rdlock() is op with a "hoist-load" barrier (just like
the first check in DCSI-MBR and DCCI above); and rdunlock() is op
with a "sink-load" barrier.
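The acquire/release pairing described above corresponds closely to what C++11 later standardized as memory_order_acquire and memory_order_release. A minimal message-passing sketch, with hypothetical names (payload, ready, publish_and_consume):

```cpp
#include <atomic>
#include <thread>

int payload = 0;                  // plain, non-atomic data
std::atomic<bool> ready{false};   // flag that publishes the data

// Starts a consumer thread, publishes a value, and returns what the
// consumer observed.
int publish_and_consume() {
    int seen = -1;
    std::thread consumer([&seen] {
        // Acquire: the read of payload below cannot be hoisted above this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    payload = 123;
    // Release: the write to payload above cannot sink below this store.
    ready.store(true, std::memory_order_release);

    consumer.join();
    return seen;
}
```

Once the consumer sees ready == true, the release/acquire pair guarantees that it also sees payload == 123.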
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk