On Wed, Jul 30, 2014 at 11:39 AM, Gavin Lambert <gavinl@compacsort.com> wrote:

On 30/07/2014 20:35, Klaim - Joël Lamotte wrote:

A really simple (albeit heavy-handed) way of doing this lock-free
(or almost lock-free, depending on shared_ptr implementation) is to
have your data in a shared_ptr; whenever something wants to act on
the data it uses atomic_load to fetch it from some well-known
location and can then do whatever read-only access it wishes; when
it wants to change the data it first copies the entire object, makes
the changes in the copy, and then uses atomic_compare_exchange to
replace the "real" version. If the compare fails, it lost the race
to a faster writer so must perform its updates again on the new data
and try exchanging again.
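
A minimal sketch of the scheme described above, using the C++11 free-function atomic overloads for std::shared_ptr (deprecated in C++20 in favor of std::atomic<std::shared_ptr<T>>). The names Config, g_current, snapshot and update are hypothetical and only illustrate the idea; this is not taken from any existing library.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload type, for illustration only.
struct Config {
    int version = 0;
    std::string name;
};

// The "well-known location" that always points at the current data.
std::shared_ptr<const Config> g_current = std::make_shared<Config>();

// Reader: fetch a snapshot. It stays valid and unchanged even if a
// writer replaces g_current afterwards.
std::shared_ptr<const Config> snapshot() {
    return std::atomic_load(&g_current);
}

// Writer: copy the whole object, modify the copy, then try to publish it.
// If another writer got there first, redo the changes on the newer data.
template <class Mutator>
void update(Mutator mutate) {
    std::shared_ptr<const Config> expected = std::atomic_load(&g_current);
    for (;;) {
        auto copy = std::make_shared<Config>(*expected);  // full copy on write
        mutate(*copy);
        std::shared_ptr<const Config> desired = std::move(copy);
        if (std::atomic_compare_exchange_weak(&g_current, &expected, desired)) {
            return;  // our version is now the "real" one
        }
        // On failure, expected has been reloaded with the winner's data,
        // so the next iteration copies and mutates that instead.
    }
}
```

A writer would call something like update([](Config& c) { c.version += 1; }); readers that already hold a snapshot keep seeing the old version until they call snapshot() again.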

This would work, I guess, but not well with big values (like a big struct).
Also, from my understanding, having one "copy" of the object for each
subscriber seems potentially more efficient if you already know
that each subscriber works with its own thread (or set of threads).

Copies are not efficient for a big struct, but copies only occur on write so it's not a big deal if writes are rare.

Yes I believe that at least in my use case it makes sense.

Having a separate copy for each subscriber is helpful to avoid write contention, but if everything is only reading from a single shared object that is seldom written to then I don't think it provides any benefit (but I'm not an expert on cache effects).

I'm not an expert either so I might be pessimizing that part.

I started an implementation and will try to keep the interface independent of the implementation so that I can try different things.

My gut feeling is that which approach is "better" depends on the number of subscribers and the frequency of actions. You do pay a bit for an atomic load/exchange (it's not a lot, but it's still something you want to avoid doing in a tight loop), but it means you only need to copy once; conversely having copies for each subscriber lets you avoid the atomics but requires you to make N copies.

In my case it's OK to have N copies, but yes, I have to make sure the user understands this cost.

There are other tradeoffs as well, of course. The one I proposed above doesn't have notifications on change; it just lets existing operations continue using the old data while new operations silently pick up the new data (basically a pull model). A push model explicitly notifies subscribers that new data is available and lets them potentially do something esoteric if required, but if implemented naively it may result in the subscribers all being called from the publisher's thread, which may introduce contention and cache fragmentation.
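
To make the pull/push distinction concrete, here is a rough sketch of a publisher that offers both. The class name Publisher and its members are hypothetical, and it uses a plain mutex rather than anything lock-free to keep the example short. The push path is deliberately the naive one: every callback runs on the thread that calls publish().

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical publisher supporting both models, for illustration only.
template <class T>
class Publisher {
public:
    using Snapshot = std::shared_ptr<const T>;
    using Callback = std::function<void(const Snapshot&)>;

    explicit Publisher(T initial)
        : current_(std::make_shared<T>(std::move(initial))) {}

    // Pull model: callers fetch the latest data whenever they want it.
    Snapshot latest() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

    // Push model: register a callback to be notified of new data.
    void subscribe(Callback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        callbacks_.push_back(std::move(cb));
    }

    // Replace the data, then notify subscribers.
    void publish(T value) {
        Snapshot snap = std::make_shared<T>(std::move(value));
        std::vector<Callback> to_call;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            current_ = snap;
            to_call = callbacks_;  // copy so callbacks run outside the lock
        }
        // Naive push: all subscribers are called from the publisher's thread.
        for (const auto& cb : to_call) {
            cb(snap);
        }
    }

private:
    mutable std::mutex mutex_;
    Snapshot current_;
    std::vector<Callback> callbacks_;
};
```

Handing each callback to the subscriber's own thread or executor instead of calling it inline would avoid running subscriber code on the publisher's thread, at the cost of extra machinery that is not shown here.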

I'll try to make my implementation usable either with callbacks invoked through executors or through a regular pull call.

Both seem interesting in different cases, and it doesn't seem to me (at the moment) that the two implementations would be mutually exclusive.

I'll report here when I have something that I can at least use in my own project.

_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users