|
Boost : |
From: William Kempf (sirwillard_at_[hidden])
Date: 2001-01-08 10:24:05
--- In boost_at_[hidden], "Andrew Green" <ag_at_e...> wrote:
> Mike Crawford wrote:
>
> > Please have a look at the ZAtomic.h and ZAtomic.cpp files in
ZooLib at:
> > http://zoolib.sourceforge.net
> The thing that keeps suprising me is just how subtle reference
counting is
> once you get into a multi-threaded world. The reference counting
> implementation that's in ZooLib now is SMP-safe in all situations
bar one,
> which I'll try to describe.
>
> Using Dejan's terminology, and ignoring the weak reference counts,
which
> only agravate the issue, assume we have a reference counting
pointer that is
> accessed by a couple of threads:
> Ptr<Type> shared;
>
> Thread one executes the following sequence:
> Ptr<Type> local1;
> shared = local1;
>
> Whilst thread two does the following:
> Ptr<Type> local2;
> local2 = shared;
>
> Thread one is doing the following:
> [1A] if (shared.ptr_)
> {
> decrement shared.refCounts_;
> If the result was zero
> delete shared.ptr_ and shared.refCounts_;
> }
> shared.ptr_ = local1.ptr_; [1B]
> shared.refCounts_ = local1.refCounts_;
> increment shared.refCounts_;
>
> Meantime, thread two is doing the following:
> if (local2.ptr_)
> {
> decrement local2.refCounts_;
> if the result was zero
> delete local2.ptr_ and local2.refCounts_;
> }
> [2A]local2.ptr_ = shared.ptr_; local2.refCounts_ =
shared.refCounts_;
> increment local2.refCounts; [2B]
>
> If shared's ref count was >1 before we start these two sequences
then we
> don't have a problem. If shared's refcount _is_ 1 then the sequence
of
> instructions 1A through 1B must not overlap with the sequence 2A
through 2B.
> Individual atomic operations do not help in this scenario -- it
seems to me
> that there's no way to safeguard against this without a larger
locking scope
> than atomic ops provide.
>
> Now one might say that this problem is obvious. But it wasn't
obvious to me.
> And there are plenty of situations where the fact that this is what
is
> happening is not immediately noticeable. If anyone has any insight
into how
> to make this stuff really safe, I'd love to learn.
I responded to another thread pointing out that I didn't think a ref-
counted smart pointer could be implemented with out a mutex in at
least one spot, and this is exactly the spot I was referring to.
I've tried to search this topic out on the net and every
implementation I've found has made this mistake (or similar, in any
event). I've tried to fathom a way to re-order the instructions that
would allow lock free synchronization here but can't. Right now I'm
inclined to think that it can't be done, but if anyone has found an
implementation that does this correctly or can fathom a way to do so
I'd really like to hear about it. The performance of copies will be
dreadfully impacted with out a lock free implementation.
Bill Kempf
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk