Boost logo

Boost :

From: Andrew Green (ag_at_[hidden])
Date: 2001-01-07 20:27:28


Mike Crawford wrote:

> Please have a look at the ZAtomic.h and ZAtomic.cpp files in ZooLib at:
> http://zoolib.sourceforge.net

The stuff in ZAtomic provides SMP-safe assembly implementations of addition,
exchange and logical operations for 68K, PPC and x86 processors, using
CodeWarrior, gcc and Visual C++. It's nothing fancy, and I would much prefer
to use versions from a standard library, if they existed.

The version that posted at sourceforge has served me and others well, but
I've found a couple of problems that will be fixed in the next ZooLib source
release later this month. Meantime, I've posted the fixes at

  <http://www.zoolib.org/~ag/snippets/ZAtomic_2001_01_07/ZAtomic.cpp>
and
  <http://www.zoolib.org/~ag/snippets/ZAtomic_2001_01_07/ZAtomic.h>

> ZooLib provides thread-safe reference-counted smart pointers, and author Andy
> Green slaved to get this working correctly in a cross-platform way -
> especially considering such things as me finding compiler bugs stimulated
> by his earlier implementations during testing!

I've been relying on reference counting in my own code for several years and
Dejan's comments on his web page resonate with my own experience; rather
than thinking of reference counting as a garbage collection technique,
(which it is useful for although there are better alternatives,) I find it
most useful for alleviating issues of ownership.

The thing that keeps suprising me is just how subtle reference counting is
once you get into a multi-threaded world. The reference counting
implementation that's in ZooLib now is SMP-safe in all situations bar one,
which I'll try to describe.

Using Dejan's terminology, and ignoring the weak reference counts, which
only agravate the issue, assume we have a reference counting pointer that is
accessed by a couple of threads:
    Ptr<Type> shared;

Thread one executes the following sequence:
    Ptr<Type> local1;
    shared = local1;

Whilst thread two does the following:
    Ptr<Type> local2;
    local2 = shared;

Thread one is doing the following:
[1A] if (shared.ptr_)
        {
        decrement shared.refCounts_;
        If the result was zero
            delete shared.ptr_ and shared.refCounts_;
        }
    shared.ptr_ = local1.ptr_; [1B]
    shared.refCounts_ = local1.refCounts_;
    increment shared.refCounts_;

Meantime, thread two is doing the following:
    if (local2.ptr_)
        {
        decrement local2.refCounts_;
        if the result was zero
            delete local2.ptr_ and local2.refCounts_;
        }
[2A]local2.ptr_ = shared.ptr_; local2.refCounts_ = shared.refCounts_;
    increment local2.refCounts; [2B]

If shared's ref count was >1 before we start these two sequences then we
don't have a problem. If shared's refcount _is_ 1 then the sequence of
instructions 1A through 1B must not overlap with the sequence 2A through 2B.
Individual atomic operations do not help in this scenario -- it seems to me
that there's no way to safeguard against this without a larger locking scope
than atomic ops provide.

Now one might say that this problem is obvious. But it wasn't obvious to me.
And there are plenty of situations where the fact that this is what is
happening is not immediately noticeable. If anyone has any insight into how
to make this stuff really safe, I'd love to learn.

A
__________________________________________________________________
Andrew Green mailto:ag_at_[hidden]
The ZooLib Guy http://www.zoolib.org
350 25th Avenue, Suite #1 Vox: +1 (415) 831 2471
San Francisco, CA 94121-1912 Fax: +1 (415) 831 2472


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk