Boost logo

Boost :

Subject: Re: [boost] [lockfree] review
From: Gottlob Frege (gottlobfrege_at_[hidden])
Date: 2011-11-05 02:03:19

On Tue, Aug 30, 2011 at 4:56 AM, Peter Dimov <pdimov_at_[hidden]> wrote:
> Gottlob Frege wrote:
>> Good question.  I think the question at hand (or in my mind) is
>> whether atomic<Foo>, implemented with an internal mutex, is valid.
> It is.
>> Given
>> A) the reordering of
>> m1.lock();
>> x1++;
>> m1.unlock();
>> m2.lock();
>> x2++;
>> m2.unlock();
>> to
>> m1.lock();
>> m2.lock();
>> x2++;
>> x1++;
>> m1.unlock();
>> m2.unlock();
> This reordering is invalid, by the way, which is why Alexander has been
> careful to use "multi_lock". If you add a second example that locks/unlocks
> m2 first, then m1, and reorder it in the same way, the reordered example can
> deadlock, and the original cannot. It somewhat depends on what you mean by
> reordering though. It's probably true that an observer thread can see the
> memory effects of the operations in the order m1.lock/m2.lock/x2++/x1++ by
> using acquire loads (or maybe try_lock, assuming that it doesn't fail
> spuriously, which it's allowed to do) on the control variables of m1 and m2
> and on x1 and x2, but this is not the same as reordering on the thread 1's
> side, whether by the compiler or by the hardware.

Sorry, I didn't mean to disappear for a month just because I was wrong :-)
(I did mention in that email that I wasn't feeling well. Still not
100%, so you are forewarned!)

I was overlapping the "critical sections" that way and wondering if
that was allowed, based on, for example, Herb Sutter's and the usual descriptions of code
floating into a critical section, but not out. (Of course I missed his
footnote mentioning that maybe the acquire/release of 2 blocks
couldn't overlap.)

Of course, the reordering happens in the data reads/writes so it is
not the lock() and potential wait-for-lock-to-be-free that would be
reordered, but, as you mentioned, the memory effects. I'm sure that's
what I meant :-)

The part that surprised me was that the effects of x2++ could be
*seen* before x1++, but of course this could only be seen by another
thread that did NOT use the locks, but instead read x2,x1 directly.
Which means they are no longer atomics-implemented-with-locks. So, in
the normal case, when thread 2 tries to read x2, it first grabs the m2
lock, which sets up the synchronize-with with the m2.unlock() in
thread 1, meaning *eveything* before that is seen first - both x2++
and x1++. Which one "happened" first doesn't matter, they can't be
seen before m2.lock(), at which point they are both visible.

OK, I'm fine now with atomics-via-individual-locks. I was worried for
a second there...


> Either way, the C++MM doesn't "think" in terms of reorderings, it specifies
> behavior of multithreaded programs. If you really think that atomic<Foo>,
> implemented with a mutex, is not sequentially consistent, there must exist
> an example (a multithreaded program) containing atomic<Foo> for which the
> memory model allows an outcome which is not SC.

P.S. I found thinking of atomics-by-locks hard wrt "sync-with". I had
to imagine an atomic<BigObject> which is implemented with a lock which
is in turn implemented with an atomic<int>. ie I'm too much in the
habit of thinking about atomics to be able to think of locks in any
other way than their atomic implementation, so it was hard to avoid
the recursion :-)

Boost list run by bdawes at, gregod at, cpdaniel at, john at