Subject: Re: [boost] [lockfree] review
From: Alexander Terekhov (terekhov_at_[hidden])
Date: 2011-08-23 09:19:54


Dave Abrahams wrote:

[... memory model ...]

> It's not really different than locking. If you want to write to shared
> data, you need some way of making it not-a-race. It's just that when
> the data structure is small enough (like an int) you can make it atomic
> instead of putting a lock around it.

No. See:

http://www.cl.cam.ac.uk/~pes20/cppppc/

Note that the proposed MM is still incomplete: it does not
(currently) support atomic RMW operations
(load-reserve/store-conditional), which are essential for locking.

regards,
alexander.

P.S. I don't like C++11 MM atomics, I think that atomic loads and
stores ought to support the following 'modes':

  Whether load/store is competing (default) or not. Competing load
  means that there might be concurrent store (to the same object).
  Competing store means that there might be concurrent load or
  store. Non-competing load/store can be performed non-atomically.

  Whether competing load/store needs remote write atomicity (default
  is no remote write atomicity). A remote-write-atomicity-yes load
  triggers undefined behavior in the case of a concurrent
  remote-write-atomicity-no store.

  Whether load/store has specified reordering constraint (default
  is no constraint specified) in terms of the following reordering
  modes:

    Whether preceding loads (in program order) can be reordered
    across it (can by default).

    Whether preceding stores (in program order) can be reordered
    across it (can by default).

    Whether subsequent loads (in program order) can be reordered
    across it (can by default). For load, the set of constrained
    subsequent loads can be limited to only dependent loads (aka
    'consume' mode).

    Whether subsequent stores (in program order) can be reordered
    across it (can by default). For load, there is an implicit
    reordering constraint regarding dependent stores (no need to
    specify it).

    A fence/barrier operation can be used to specify reordering
    constraint using basically the same modes.
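
For comparison, here is a minimal sketch of what standard C++11
fences can express today. It is only an approximation of the modes
listed above: a C++11 release fence keeps both preceding loads and
preceding stores from moving past later stores, and an acquire fence
keeps later loads and stores from moving before earlier loads, so the
four modes cannot be selected individually. The names payload, ready,
publish and try_read are made up for this illustration.

```cpp
#include <atomic>
#include <cassert>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready(false);  // flag that publishes payload

// Writer side: the release fence keeps the write to payload (and any
// other preceding load or store) from being reordered past the
// relaxed store to ready.
void publish(int value) {
  payload = value;
  std::atomic_thread_fence(std::memory_order_release);
  ready.store(true, std::memory_order_relaxed);
}

// Reader side: the acquire fence keeps the read of payload from being
// reordered before the relaxed load of ready.
bool try_read(int& out) {
  if (!ready.load(std::memory_order_relaxed)) return false;
  std::atomic_thread_fence(std::memory_order_acquire);
  out = payload;
  return true;
}
```

With one thread calling publish() and another polling try_read(), a
successful try_read() is guaranteed to see the published value.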

Re C++11 MM, I'm still missing more fine-grained memory order
labels, such as those in the pseudo-C++ example below.

(I mean mo::noncompeting, mo::ssb/ssb_t (sink store barrier, a
release not affecting preceding loads), mo::slb/slb_t (a release not
affecting preceding stores) below, and somesuch for relaxed acquire.)

// Introspection (for bool argument below) aside for a moment
template<typename T, bool copy_ctor_or_dtor_can_mutate_object>
class mutex_and_condvar_free_single_producer_single_consumer {

  typedef isolated< aligned_storage< T > > ELEM;

  size_t m_size; // > 1
  ELEM * m_elem; // array of elements, init'ed by ctor
  atomic< ELEM * > m_head; // initially == m_elem
  atomic< ELEM * > m_tail; // initially == m_elem

  ELEM * advance(ELEM * elem) const {
    return (++elem < m_elem + m_size) ? elem : m_elem;
  }

public:

  mutex_and_condvar_free_single_producer_single_consumer(); // ctor
 ~mutex_and_condvar_free_single_producer_single_consumer(); // dtor

  void producer(const T & value) {
    ELEM * tail = m_tail.load(mo::noncompeting); // may be nonatomic
    ELEM * next = advance(tail);
    while (next == m_head.load(mo::relaxed)) usleep(1000);
    new(tail) T(value); // placement copy ctor (make queued copy)
    m_tail.store(next, mo::ssb); // cheaper than mo::release
  }

  T consumer() {
    ELEM * head = m_head.load(mo::noncompeting); // may be nonatomic
    while (head == m_tail.load(mo::consume)) usleep(1000);
    T value(*head); // T's copy ctor (make a copy to return)
    head->~T(); // T's dtor (cleanup for queued copy)
    m_head.store(advance(head), type_list< mo::slb_t, mo::rel_t >::
      element<copy_ctor_or_dtor_can_mutate_object>::type());
    return value; // return copied T
  }

};

Note also that, given that the example above presumes that no more
than one thread can read from the relevant atomic locations while
they are written concurrently, there is definitely no need to pay the
price of remote write atomicity even if it is run on a 3+ way
multiprocessor... IOW, hwsync is unneeded even if all mo::* above
are changed to SC... but the upcoming C++11 MM provides no way to
express no-need-for-remote-write-atomicity for SC atomics.
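
For reference, a rough sketch of the same queue restricted to the
labels standard C++11 actually provides. The assumptions are mine:
indices replace the ELEM pointers, the usleep loops become
non-blocking try_push/try_pop, mo::noncompeting becomes a relaxed
load, mo::ssb and mo::slb_t/mo::rel_t both become release, and the
consume load becomes acquire. The producer's load of m_head is also
acquire here, since C++11 does not give relaxed loads any implicit
ordering with respect to dependent stores. The names spsc_queue,
try_push and try_pop are illustrative only.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

template <typename T>
class spsc_queue {
  struct slot { alignas(T) unsigned char raw[sizeof(T)]; };

  std::vector<slot> m_elem;         // one slot always stays empty
  std::atomic<std::size_t> m_head;  // written by the consumer only
  std::atomic<std::size_t> m_tail;  // written by the producer only

  std::size_t advance(std::size_t i) const {
    return (i + 1 < m_elem.size()) ? i + 1 : 0;
  }
  T* at(std::size_t i) { return reinterpret_cast<T*>(m_elem[i].raw); }

public:
  explicit spsc_queue(std::size_t size)
      : m_elem(size), m_head(0), m_tail(0) {
    assert(size > 1);
  }

  ~spsc_queue() {  // destroy whatever is still queued
    for (std::size_t i = m_head.load(); i != m_tail.load(); i = advance(i))
      at(i)->~T();
  }

  bool try_push(const T& value) {  // producer thread only
    std::size_t tail = m_tail.load(std::memory_order_relaxed);
    std::size_t next = advance(tail);
    // acquire pairs with the consumer's release store to m_head
    if (next == m_head.load(std::memory_order_acquire)) return false;
    new (at(tail)) T(value);  // placement copy ctor (queued copy)
    m_tail.store(next, std::memory_order_release);
    return true;
  }

  bool try_pop(T& out) {  // consumer thread only
    std::size_t head = m_head.load(std::memory_order_relaxed);
    // acquire pairs with the producer's release store to m_tail
    if (head == m_tail.load(std::memory_order_acquire)) return false;
    T* elem = at(head);
    out = *elem;  // copy out the queued value
    elem->~T();   // cleanup for queued copy
    m_head.store(advance(head), std::memory_order_release);
    return true;
  }
};
```

Everything stronger than what the pseudo code asks for (release
instead of ssb/slb, acquire instead of consume or relaxed) is exactly
the overhead the finer-grained labels are meant to avoid.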