
From: Sid Sacek (ssacek_at_[hidden])
Date: 2007-09-15 23:35:35


I don't like the spin-lock approach for the atomic variables.

If the mutex is already locked by another thread, spinning just wastes CPU
cycles on a single-processor machine until the time-slice for the current
thread comes to an end. That's usually a 10-millisecond waste of CPU in the
worst case scenario.

It might have some benefit on a multi-processor machine, but only if the
thread that currently owns the mutex lock has not already been pre-empted.

I don't see much benefit of the spin-lock approach over a system mutex,
since the system mutex at least allows other threads to continue running
while the mutex is being held.

A hybrid of a spin-lock and a mutex sounds better to me than a simple
spin-lock. What I mean is that if the spin-lock does not acquire the lock
in short order, the loop is probably going to last a very long time
(possibly up to 10 milliseconds). So if the loop does not exit quickly,
yield the processor to some other thread, possibly even the thread that
is holding the lock in the first place.
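
Something along these lines is what I have in mind. This is only a
sketch: swap() stands for an atomic swap like the one Phil describes
below, sched_yield() is the POSIX call, and the spin limit of 100 is an
arbitrary illustrative value.

#include <sched.h>

// Hypothetical atomic swap: atomically exchanges the values of mem and reg.
void swap(int& mem, int& reg); // assumed to be implemented elsewhere

void hybrid_lock(int& mutex)
{
    for (;;) {
        // Spin for a bounded number of attempts first.
        for (int i = 0; i < 100; ++i) {
            int n = 1;
            swap(mutex, n);      // try to take the lock
            if (n == 0) return;  // it was unlocked; we now hold it
        }
        // Still locked: give up the rest of the time-slice so the
        // lock holder (or anyone else) can run.
        sched_yield();
    }
}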

There must be a better method than the one described below. I don't have
any good answers right now myself, but I hope my concern is clear.

-Sid

-----Original Message-----
From: boost-bounces_at_[hidden] [mailto:boost-bounces_at_[hidden]]
On Behalf Of Phil Endecott
Sent: Monday, September 10, 2007 3:05 PM
To: boost_at_[hidden]
Subject: Re: [boost] Shared pointer atomic assembler for ARM?

Phil Endecott wrote:
> Peter Dimov wrote:
>> Phil Endecott:
>>> I note that shared_ptr uses architecture-specific assembler for the
>>> atomic operations needed for thread safe operations, on x86, ia64 and
>>> ppc; it falls back to pthreads for other architectures. Has anyone
>>> quantified the performance benefit of the assembler?

(I'm still curious about this.)

>>> Assuming that the benefit is significant, I'd like to implement it for
>>> ARM. Has anyone else looked at this?
>>>
>>> ARM has a swap instruction. I have a (very vague) recollection that
>>> perhaps some of the newer chips have some other locked instructions
>>> e.g. test-and-set, but I would want to code to the lowest common
>>> denominator i.e. swap only. Is this sufficient for what shared_ptr wants?

>> To answer your question: no, a mere swap instruction is not enough for
>> shared_ptr, it needs atomic increment, decrement and compare and swap.
>
> Well, I think you can implement a spin-lock mutex with swap:
>
> int mutex=0; // 0 = unlocked, 1 = locked
>
> void lock() {
>   int n;
>   do {
>     n = 1;
>     swap(mutex,n); // atomic swap instruction
>   } while (n==1); // if n is 1 after the swap, the mutex was already locked
> }
>
> void unlock() {
>   mutex=0;
> }
>
>
> So you could use something like that to protect the reference counts,
> rather than falling back to the pthread method. Or alternatively,
> could you use a sentinel value (say -1) in the reference count to
> indicate that it's locked:
>
> int refcount;
>
> int read_refcount() {
>   int r;
>   do {
>     r = refcount;
>   } while (r==-1);
>   return r;
> }
>
> int adj_refcount(int adj) {
>   int r = -1;
>   do {
>     swap(refcount,r); // leaves -1 in refcount, i.e. locked
>   } while (r==-1);
>   r += adj;
>   refcount = r; // storing the new value also unlocks
>   return r;
> }

Right, I have had a go at implementing this. The following code seems
to generate plausible assembler but has not been tested. Maybe someone
has a multi-threaded shared pointer test program that does zillions of
shared pointer accesses to see if they ever mess up? But I would prefer
to think that this sort of code can be "proved correct" by the "many eyes"
method! So let me know if you can see any faults. In particular, I
have never used weak pointers, so I don't really know what that stuff
is doing.
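
For what it's worth, a rough sketch of the sort of stress test I have in
mind might look like the following (only a sketch, assuming boost::thread
is available; the thread and iteration counts are arbitrary):

#include <boost/shared_ptr.hpp>
#include <boost/weak_ptr.hpp>
#include <boost/thread.hpp>
#include <cassert>

namespace {

boost::shared_ptr<int> global(new int(42));

// Copying and destroying shared and weak pointers from several threads
// exercises the atomic increment/decrement and lock paths.
void hammer()
{
    for (int i = 0; i < 1000000; ++i) {
        boost::shared_ptr<int> a(global);
        boost::shared_ptr<int> b(a);
        boost::weak_ptr<int> w(b);
        boost::shared_ptr<int> c(w.lock()); // exercises add_ref_lock()
    }
}

} // namespace

int main()
{
    boost::thread t1(hammer);
    boost::thread t2(hammer);
    boost::thread t3(hammer);
    t1.join(); t2.join(); t3.join();

    // If the reference counting ever raced we would most likely have
    // crashed already, or the final count would not be exactly 1.
    assert(global.use_count() == 1);
    return 0;
}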

This code can also be found at:

https://svn.chezphil.org/boost/trunk/include/boost/detail/

Regards,

Phil.

#ifndef BOOST_DETAIL_SP_COUNTED_BASE_GCC_ARM_HPP_INCLUDED
#define BOOST_DETAIL_SP_COUNTED_BASE_GCC_ARM_HPP_INCLUDED

//
// detail/sp_counted_base_gcc_arm.hpp
//
// Copyright (c) 2001, 2002, 2003 Peter Dimov and Multi Media Ltd.
// Copyright 2004-2005 Peter Dimov
// Copyright 2007 Phil Endecott
//
// Distributed under the Boost Software License, Version 1.0. (See
// accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)
//
// ARM version using swap by Phil Endecott, based on sp_counted_base_nt.hpp
//

#include <typeinfo>

namespace boost
{

namespace detail
{

// All ARM processors since the ARM3 (circa 1989) have an atomic swap
// instruction, but apart from a few very recent chips (the ARMv6
// architecture, not to be confused with the ARM6) it is the only atomic
// operation that they support.
//
// Implementing shared pointers using only swap requires a different
// approach from that used on other platforms: we store a sentinel value,
// -1, in the count variables to indicate that they are locked. If an
// attempt to read a count gets -1, we spin until a non-negative value is
// returned.
//
// The following bit of assembler wraps the swap instruction. It takes two
// register operands and one memory operand, i.e.
//   swp r1, r2, [r3]
// is an atomic equivalent of
//   ldr r1, [r3]
//   str r2, [r3]

static inline long atomic_read_and_lock(long& mem)
{
  long val;
  do {
    // Equivalent to:
    //   val = mem;
    //   mem = -1;
    // The earlyclobber (&) stops gcc from allocating val to the same
    // register as the address operand; swp requires them to differ.
    asm volatile ("swp\t%0, %1, [%2]"
                 :"=&r" (val)
                 :"r" (-1),
                  "r" (&mem)
                 :"memory");
  } while (val==-1);
  return val;
}

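// atomic_inc adjusts the count by inc: it locks the count by swapping in
// the -1 sentinel, adds inc to the value read, and stores the result,
// which also releases the lock. It returns the new value.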
static inline long atomic_inc(long& mem, long inc)
{
  return mem = atomic_read_and_lock(mem)+inc;
}

class sp_counted_base
{
private:

    sp_counted_base( sp_counted_base const & );
    sp_counted_base & operator= ( sp_counted_base const & );

    long use_count_; // #shared
    long weak_count_; // #weak + (#shared != 0)

public:

    sp_counted_base(): use_count_( 1 ), weak_count_( 1 )
    {
    }

    virtual ~sp_counted_base() // nothrow
    {
    }

    // dispose() is called when use_count_ drops to zero, to release
    // the resources managed by *this.

    virtual void dispose() = 0; // nothrow

    // destroy() is called when weak_count_ drops to zero.

    virtual void destroy() // nothrow
    {
        delete this;
    }

    virtual void * get_deleter( std::type_info const & ti ) = 0;

    void add_ref_copy()
    {
        atomic_inc(use_count_,+1);
    }

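    // add_ref_lock() increments use_count_ only if it is still non-zero.
    // It returns false if the count has already dropped to zero, i.e. the
    // object has expired; this is what locking a weak pointer relies on.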
    bool add_ref_lock() // true on success
    {
        long c = atomic_read_and_lock(use_count_);
        if (c==0) {
            use_count_ = 0;
            return false;
        }
        use_count_ = c+1;
        return true;
    }

    void release() // nothrow
    {
        if( atomic_inc(use_count_,-1) == 0 )
        {
            dispose();
            weak_release();
        }
    }

    void weak_add_ref() // nothrow
    {
        atomic_inc(weak_count_,+1);
    }

    void weak_release() // nothrow
    {
        if( atomic_inc(weak_count_,-1) == 0 )
        {
            destroy();
        }
    }

    long use_count() const // nothrow
    {
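        // Spin while the count holds the -1 lock sentinel.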
        while (1) {
            long c = use_count_;
            if (c>=0) return c;
        }
    }
};

} // namespace detail

} // namespace boost

#endif // #ifndef BOOST_DETAIL_SP_COUNTED_BASE_GCC_ARM_HPP_INCLUDED
