Subject: Re: [boost] Proposal for a thread synchronisation primitive: future_queue
From: Gavin Lambert (gavinl_at_[hidden])
Date: 2018-05-09 06:43:34


On 9/05/2018 17:30, Tim Blechmann wrote:
>>> On a second thought, maybe I could even change the implementation of
>>> the future_queue to internally use e.g. the well-known
>>> boost::lockfree queues together with some semaphore to implement the
>>> blocking behaviour. I am not 100% sure yet, but I think all current
>>> features could be implemented by that. Do you think this would
>>> improve the implementation? It might be a bit simpler and maybe even
>>> more efficient in some cases.
>>
>> I would love to see a portable eventcount
>> (http://www.1024cores.net/home/lock-free-algorithms/eventcounts)
>> implementation in Boost.Lockfree.
>
> I'm not sure if boost.lockfree is a good place for it (is it even
> possible to implement them without relying on system calls, which may
> acquire locks in kernel-space?)

Even the Boost.LockFree queue may not always be lock-free -- e.g. if you
push more nodes than the queue's capacity (and it's not fixed_sized) it
will allocate more, which will often take a Big System Lock (though
modern allocators try to use smaller, less-contended locks to reduce the
impact).
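
To illustrate (just a sketch -- the fixed_sized policy is Boost.Lockfree's,
everything else here is made up for the example):

#include <boost/lockfree/policies.hpp>
#include <boost/lockfree/queue.hpp>

int main()
{
    using namespace boost::lockfree;

    // Default policy: 128 nodes are pre-allocated, but once they run out
    // push() falls back to the heap -- which is where the Big System Lock
    // (or the allocator's smaller locks) sneaks back in.
    queue<int> growing(128);
    growing.push(1);

    // fixed_sized<true>: the node pool never grows; push() just returns
    // false when the queue is full, so it stays lock-free but the caller
    // has to decide what to do about "full" (spin, drop, block, ...).
    queue<int, fixed_sized<true>> bounded(128);
    if (!bounded.push(1)) {
        // queue full
    }
}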

And such primitives are usually implemented with a fast path that
performs no kernel operations and a slow path that does (along with the
hope that the slow path is not hit too often).

As soon as you start talking about wanting to block, you have to involve
the kernel (or a userspace scheduler). This is a choice to gain some
convenience while sacrificing some "lock-free-ness", just like the
choice to use a non-fixed queue.

The basic "lock-free but blocking" expected behaviour is:

   1. If pushing but the queue is full, then block until it isn't
      (blocking path)
   2. If a popper is waiting, push and then wake them up (slow-path)
   3. Otherwise just push without kernel operations (fast-path)

And similarly on the consuming end for an empty queue. (And a
non-fixed-size queue might omit step #1 entirely, though still end up
blocking a little if it hits an allocator.)
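
Something along these lines, as a rough sketch of those three steps
(std::counting_semaphore is only used for brevity -- it's C++20, and any
native semaphore does the same job -- and the class itself is invented
for the example):

#include <boost/lockfree/policies.hpp>
#include <boost/lockfree/queue.hpp>
#include <semaphore>

// Sketch: a bounded, blocking MPMC queue layered over a lock-free core.
class blocking_int_queue {
    static constexpr int capacity = 1024;

    boost::lockfree::queue<int, boost::lockfree::capacity<capacity>> q_;
    std::counting_semaphore<capacity> free_slots_{capacity};  // step 1
    std::counting_semaphore<capacity> items_{0};              // steps 2/3

public:
    void push(int v) {
        free_slots_.acquire();   // 1. blocking path: wait until there is room
        while (!q_.push(v)) { }  // 3. lock-free push; the permit guarantees
                                 //    a node is free (or about to be)
        items_.release();        // 2. wake a sleeping popper; a futex-backed
                                 //    semaphore usually stays in user space
                                 //    when nobody is actually waiting
    }

    void pop(int& out) {
        items_.acquire();        // block while the queue is empty
        while (!q_.pop(out)) { }
        free_slots_.release();   // unblock a pusher stuck in step 1
    }
};

Drop free_slots_ (and use the growing queue) and you get the
non-fixed-size variant that only ever blocks inside the allocator.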

The main trick is in the wakeups (and in reliably knowing whether
anyone is waiting). You can never really avoid making kernel calls to
wake up another thread, but you want to do so only when actually needed
(erring on the side of an unnecessary wakeup rather than a missed one),
and you need to use constructs that (while perhaps not completely
lock-free) are at least wait-free.
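
Concretely that usually means a waiter count checked on the producer
side, with a re-check on the consumer side so a push can't slip through
unnoticed. A very rough sketch (names invented, memory ordering kept
deliberately heavy-handed -- packaging this up properly is exactly what
an eventcount is for):

#include <boost/lockfree/queue.hpp>
#include <atomic>
#include <semaphore>

// Sketch: unbounded push, with "wake only if somebody is waiting" on pop.
class wakeup_queue {
    boost::lockfree::queue<int> q_{128};
    std::atomic<int> waiters_{0};
    std::counting_semaphore<> wake_{0};

public:
    void push(int v) {
        while (!q_.push(v)) { }             // may allocate (see above)
        std::atomic_thread_fence(std::memory_order_seq_cst);
        if (waiters_.load(std::memory_order_relaxed) > 0)
            wake_.release();                // kernel call only when needed;
                                            // an extra wakeup is harmless,
                                            // a missed one is not
    }

    void pop(int& out) {
        if (q_.pop(out)) return;            // fast path: no kernel involvement
        waiters_.fetch_add(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        while (!q_.pop(out))                // re-check after announcing
            wake_.acquire();                // ourselves, so the push above
                                            // can't miss us; spurious wakeups
                                            // just loop around
        waiters_.fetch_sub(1, std::memory_order_relaxed);
    }
};

The two fences are the store-buffering ("Dekker") part: either the
producer sees the incremented waiter count, or the consumer's re-check
sees the pushed element. An eventcount wraps exactly that announce /
re-check / wait dance behind a tidy interface, which is why a portable
one in Boost.Lockfree would be nice.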

Semaphores and Win32 events are usually good candidates for this,
because the scheduler generally treats them as atomic operations as far
as user code is concerned. Traditional mutexes (and condition
variables, since they rely on those) are bad candidates because the
thread might be preempted while a lock is held, which prevents forward
progress of another thread.

But you also usually have to be careful to use the "true"
synchronisation primitives of the platform, not emulations defined in
libraries, because those might use locks or other constructs that are
problematic. When you're implementing a platform-specific abstraction,
it's one of the rare cases where it's not a good idea to depend on other
abstractions.
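
On most platforms that boils down to a very thin wrapper over the native
semaphore -- something like the sketch below (POSIX sem_t on the
Unix-likes, a plain Win32 semaphore on Windows; error handling omitted,
and macOS would want a GCD semaphore instead, since it deprecates
unnamed POSIX ones):

#ifdef _WIN32
  #include <windows.h>
  #include <climits>
#else
  #include <semaphore.h>
  #include <cerrno>
#endif

// Sketch: the platform's own semaphore, with no library emulation in between.
class native_semaphore {
#ifdef _WIN32
    HANDLE h_;
public:
    native_semaphore()  { h_ = CreateSemaphoreW(nullptr, 0, LONG_MAX, nullptr); }
    ~native_semaphore() { CloseHandle(h_); }
    void post() { ReleaseSemaphore(h_, 1, nullptr); }
    void wait() { WaitForSingleObject(h_, INFINITE); }
#else
    sem_t s_;
public:
    native_semaphore()  { sem_init(&s_, /*pshared=*/0, /*value=*/0); }
    ~native_semaphore() { sem_destroy(&s_); }
    void post() { sem_post(&s_); }  // async-signal-safe, no user-space lock
    void wait() { while (sem_wait(&s_) == -1 && errno == EINTR) { } }
#endif

    native_semaphore(const native_semaphore&) = delete;
    native_semaphore& operator=(const native_semaphore&) = delete;
};

post() here never holds anything another thread could be preempted while
owning, and glibc's sem_post, for instance, can skip the futex wake
entirely when nobody is waiting.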

