Boost logo

Boost :

From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2007-09-10 15:04:43


Phil Endecott wrote:
> Peter Dimov wrote:
>> Phil Endecott:
>>> I note that shared_ptr uses architecture-specific assembler for the
>>> atomic operations needed for thread safe operations, on x86, ia64 and
>>> ppc; it falls back to pthreads for other architectures. Has anyone
>>> quantified the performance benefit of the assembler?

(I'm still curious about this.)

>>> Assuming that the benefit is significant, I'd like to implement it for
>>> ARM. Has anyone else looked at this?
>>>
>>> ARM has a swap instruction. I have a (very vague) recollection that
>>> perhaps some of the newer chips have some other locked instructions
>>> e.g. test-and-set, but I would want to code to the lowest common
>>> denominator i.e. swap only. Is this sufficient for what shared_ptr wants?

>> To answer your question: no, a mere swap instruction is not enough for
>> shared_ptr, it needs atomic increment, decrement and compare and swap.
>
> Well, I think you can implement a spin-lock mutex with swap:
>
> int mutex=0; // 0 = unlocked, 1 = locked
>
> void lock() {
> do {
> int n=1;
> swap(mutex,n); // atomic swap instruction
> } while (n==1); // if n is 1 after the swap, the mutex was already locked
> }
>
> void unlock() {
> mutex=0;
> }
>
>
> So you could using something like that to protect the reference counts,
> rather than falling back to the pthread method. Or alternatively,
> could you use a sentinel value (say -1) in the reference to indicate
> that it's locked:
>
> int refcount;
>
> int read_refcount() {
> do {
> int r = refcount;
> } while (r==-1);
> return r;
> }
>
> int adj_refcount(int adj) {
> int r=-1;
> do {
> swap(refcount,r);
> } while (r==-1);
> refcount = r+adj;
> }

Right, I have had a go at implementing this. The following code seems
to generate plausible assembler but has not been tested. Maybe someone
has a multi-threaded shared pointer test program that does zillions of
shared pointer accesses to see if they ever mess up? But I would prefer
to think that this sort of code can be "proved correct" by the "many eyes"
method! So let me know if you can see any faults. In particular, I
have never used weak pointers, so I don't really know what that stuff
is doing.

This code can also be found at:

https://svn.chezphil.org/boost/trunk/include/boost/detail/

Regards,

Phil.

#ifndef BOOST_DETAIL_SP_COUNTED_BASE_GCC_ARM_HPP_INCLUDED
#define BOOST_DETAIL_SP_COUNTED_BASE_GCC_ARM_HPP_INCLUDED

//
// detail/sp_counted_base_gcc_arm.hpp
//
// Copyright (c) 2001, 2002, 2003 Peter Dimov and Multi Media Ltd.
// Copyright 2004-2005 Peter Dimov
// Copyright 2007 Phil Endecott
//
// Distributed under the Boost Software License, Version 1.0. (See
// accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)
//
// ARM version using swap by Phil Endecott, based on sp_counted_base_nt.hpp
//

#include <typeinfo>

namespace boost
{

namespace detail
{

// All ARM processors since ARM3 (circa 1989) have an atomic swap instruction,
// but apart from a few very recent chips (the v6 architecture, not to be confused
// with the ARM6) this is the only atomic operation that they support.
//
// Implementing shared pointers using only swap requires a different approach
// from that used on other platforms: we store a sentinel value, -1, in the
// count variables to indicate that they are locked. If an attempt to read
// a count gets -1, we spin until a non-negative value is returned.
//
// The following bit of assembler wraps the swap instruction. It takes two
// register operands and one memory operands, i.e.
// swp r1, r2, [r3]
// is equivalent to
// ldw r1, [r3]
// stw r2, [r3]

static inline long atomic_read_and_lock(long& mem)
{
  long val;
  do {
    // Equivalent to:
    // val = mem;
    // mem = -1;
    asm volatile ("swp\t%0, %1, [%2]"
                 :"=r" (val)
                 :"r" (-1),
                  "r" (&mem)
                 :"memory");
  } while (val==-1);
  return val;
}

static inline long atomic_inc(long& mem, long inc)
{
  return mem = atomic_read_and_lock(mem)+inc;
}

class sp_counted_base
{
private:

    sp_counted_base( sp_counted_base const & );
    sp_counted_base & operator= ( sp_counted_base const & );

    long use_count_; // #shared
    long weak_count_; // #weak + (#shared != 0)

public:

    sp_counted_base(): use_count_( 1 ), weak_count_( 1 )
    {
    }

    virtual ~sp_counted_base() // nothrow
    {
    }

    // dispose() is called when use_count_ drops to zero, to release
    // the resources managed by *this.

    virtual void dispose() = 0; // nothrow

    // destroy() is called when weak_count_ drops to zero.

    virtual void destroy() // nothrow
    {
        delete this;
    }

    virtual void * get_deleter( std::type_info const & ti ) = 0;

    void add_ref_copy()
    {
        atomic_inc(use_count_,+1);
    }

    bool add_ref_lock() // true on success
    {
        long c = atomic_read_and_lock(use_count_);
        if (c==0) {
            use_count_ = 0;
            return false;
        }
        use_count_ = c+1;
        return true;
    }

    void release() // nothrow
    {
        if( atomic_inc(use_count_,-1) == 0 )
        {
            dispose();
            weak_release();
        }
    }

    void weak_add_ref() // nothrow
    {
        atomic_inc(weak_count_,+1);
    }

    void weak_release() // nothrow
    {
        if( atomic_inc(weak_count_,-1) == 0 )
        {
            destroy();
        }
    }

    long use_count() const // nothrow
    {
        while (1) {
            long c = use_count_;
            if (c>=0) return c;
        }
    }
};

} // namespace detail

} // namespace boost

#endif // #ifndef BOOST_DETAIL_SP_COUNTED_BASE_GCC_ARM_HPP_INCLUDED


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk