Subject: Re: [boost] [thread] Alternate future implementation and future islands.
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-03-19 07:09:43


On 17 Mar 2015 at 21:42, Giovanni Piero Deretta wrote:

> I just wanted to share my attempt on a solution on the problem. The
> code can be found at
> https://github.com/gpderetta/libtask/blob/master/future.hpp (other
> interesting files are shared_future.hpp event.hpp and
> tests/future_test.cpp). It is an implementation of a subset of the
> current boost and std future interface. In particular it has promise,
> future, shared_future, future::then, wait_any, wait_all. The most
> important missing piece is timed waits (for lack of, ahem, time, but
> they should be easy to implement). The implementation requires C++14
> and is only lightly tested; it should be treated as a proof of
> concept, not production-ready code.

Firstly, thanks for your input, especially one supported with code.

> * The wait strategy is not part of the future implementation (although
> a default is provided). future::get, future::wait, wait_all and
> wait_any are parametrized by the wait strategy.
>
> * The design aims to streamline and make as fast as possible future
> and promise at the cost of a slower shared_future (although there is
> room for improvement).

Your future still allocates memory, and that allocation alone costs on
the order of 1000 CPU cycles. That is not "as fast as possible". The
use of std::atomic also prevents the compiler optimiser from eliding
the future implementation entirely, because an atomic always forces
code to be generated. As much as futures today are big, heavy things,
tomorrow's C++17 futures, and especially those integrated with
resumable functions, must be completely elidable, so that the compiler
can eliminate and/or collapse a resumable function call, or a sequence
of such calls, where appropriate. If you have an atomic in there, the
compiler has no choice but to emit the atomic operations needlessly.

I think memory allocation is unavoidable for shared_future, or at
least for anything realistically close to the STL design of one. But a
future, I think, can be STL compliant, never allocate memory, and be
optimisable out of existence.
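
To make the point concrete, here is a deliberately naive,
single-threaded sketch (names entirely my own, not anybody's real
implementation): with the state stored inline and nothing atomic in
it, the optimiser is free to collapse the whole promise/future dance
down to a constant. The cross-thread signalling is precisely what a
pluggable waiter/event would then add on top, and only when actually
needed.

#include <utility>

// Toy only: single threaded, no atomics, no allocation, not STL
// conformant. It deliberately ignores cross-thread signalling.
template<class T>
struct lite_state
{
  bool has_value = false;
  T value{};                    // inline storage, never heap allocated
};

template<class T>
struct lite_future
{
  lite_state<T> *s = nullptr;   // non-owning pointer into the promise
  bool ready() const { return s && s->has_value; }
  T get() { return std::move(s->value); }
};

template<class T>
struct lite_promise
{
  lite_state<T> state;
  void set_value(T v) { state.value = std::move(v); state.has_value = true; }
  lite_future<T> get_future() { return lite_future<T>{&state}; }
};

int add_async()                 // with no atomics in the way, the
{                               // optimiser collapses this to "return 42"
  lite_promise<int> p;
  p.set_value(42);
  return p.get_future().get();
}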

> * The wait strategy only deals with 'event' objects, which act as a
> bridge between the future and the promise.
>
> The event object is really the core of my contribution; it can be
> thought of as the essence of future<void>::then; alternatively it can
> be seen as a pure user space synchronization primitive.

Exactly as my C11 permit object is, except that mine allows C code and
C++ code to interoperate and compose waits together.
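
For the record, the bridge shape I have in mind looks roughly like
this (the names are mine, not your event.hpp and not my C permit API,
and the races between then() and signal() that real code must handle
are deliberately glossed over):

#include <atomic>
#include <functional>

// Minimal waiter interface: the event only ever parks and unparks.
struct waiter_iface
{
  virtual void park() = 0;
  virtual void unpark() = 0;
  virtual ~waiter_iface() {}
};

// Toy bridge between producer and consumer: signal once, and the
// consumer either blocks through a pluggable waiter or has attached a
// continuation, which is essentially future<void>::then.
struct event_sketch
{
  std::atomic<bool> signalled{false};
  std::atomic<waiter_iface *> waiting{nullptr};
  std::function<void()> continuation;

  void then(std::function<void()> f) { continuation = std::move(f); }

  void signal()
  {
    signalled.store(true, std::memory_order_release);
    if (waiter_iface *w = waiting.exchange(nullptr))
      w->unpark();                              // wake a blocked consumer
    if (continuation) continuation();           // fire any attached .then
  }

  void wait(waiter_iface &w)
  {
    waiting.store(&w);
    while (!signalled.load(std::memory_order_acquire))
      w.park();                                 // the waiter decides how to block
  }
};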

> Other features of this implementation:
>
> * Other than in the blocking strategy itself, the future and promise
> implementation have no sources of blocking (no mutexes, not even spin
> locks).
>
> * The shared state is not reference counted.
>
> To demonstrate the generality of the waiting concept, I have
> implemented a few waiter objects.
>
> * cv_waiter: this is simply a waiter on top of an std::mutex + cond var.
>
> * futex_waiter (linux): an atomic counter + a futex, possibly more
> efficient than the cv_waiter
>
> * sem_waiter (posix): a waiter implemented on top of a posix
> semaphore. More portable than the futex waiter and possibly more
> efficient than the cv_waiter
>
> * fd_waiter(linux, possibly posix): a waiter implemented on top of
> linux eventfd (for portability it can also be implemented on top of a
> pipe); you can use select with futures!
>
> * task_waiter: a completely userspace based coroutine waiter which
> switches to another coroutine on wait and resumes the original
> coroutine on signal.
>
> * scheduler_waiter: another coroutine-based waiter, but on top of a
> userspace task scheduler. On wait, switch to the next ready task; on
> signal, enqueue the waiting task at the back of the ready queue.
>
> I know that there is a plan to reimplement a new version of
> boost::threads, hopefully this implementation can contribute a few
> ideas.
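
Just to check my reading of the waiter concept against your list
above: as I picture it, the cv_waiter reduces to something like the
following (again my own names), and the other waiters simply change
what park() and unpark() map onto, be that a futex, a semaphore, an
eventfd or a coroutine switch.

#include <condition_variable>
#include <mutex>

// The future/event code only ever calls park()/unpark(); this waiter
// maps those onto std::mutex + std::condition_variable. The latched
// "woken" flag means an unpark() arriving before park() is not lost.
struct cv_waiter_sketch
{
  std::mutex m;
  std::condition_variable cv;
  bool woken = false;

  void park()                         // called by the waiting side
  {
    std::unique_lock<std::mutex> l(m);
    cv.wait(l, [this] { return woken; });
    woken = false;                    // reset for reuse
  }

  void unpark()                       // called by the signalling side
  {
    { std::lock_guard<std::mutex> l(m); woken = true; }
    cv.notify_one();
  }
};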

My current best understanding of Vicente's plans is that each thread
has a thread-local condvar. The sole thing a thread ever sleeps on,
apart from i/o, is that thread-local condvar. One therefore has a
runtime which keeps a registry of all the thread-local condvars and
can deduce the correct ones to wake, which lets it implement a unified
wait system that is also compatible with Fibers/resumable functions.
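
In other words, something shaped roughly like this, as far as I
understand the plan (my naming, not anything Vicente has published):

#include <condition_variable>
#include <mutex>
#include <thread>
#include <unordered_map>

// One condvar per thread; a central registry is the only thing that
// needs to know how to wake anybody, whether the sleeper is a plain
// thread, a fiber scheduler or a resumable function trampoline.
struct per_thread_wait
{
  std::mutex m;
  std::condition_variable cv;
  bool woken = false;
};

struct wait_registry
{
  std::mutex m;
  std::unordered_map<std::thread::id, per_thread_wait *> slots;

  void attach(std::thread::id id, per_thread_wait *w)
  {
    std::lock_guard<std::mutex> l(m);
    slots[id] = w;
  }

  void wake(std::thread::id id)       // the unified wake entry point
  {
    per_thread_wait *w = nullptr;
    {
      std::lock_guard<std::mutex> l(m);
      auto it = slots.find(id);
      if (it != slots.end()) w = it->second;
    }
    if (!w) return;
    { std::lock_guard<std::mutex> l(w->m); w->woken = true; }
    w->cv.notify_one();
  }
};

void sleep_until_woken(per_thread_wait &w)  // what the sleeping side does
{
  std::unique_lock<std::mutex> l(w.m);
  w.cv.wait(l, [&] { return w.woken; });
  w.woken = false;
}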

I haven't played yet with the proposed Boost.Fiber (I will this
summer, after C++ Now), but I expect it uses a similar runtime.
Ideally I'd like Thread v5's runtime to be identical to the Fiber
runtime if that is true, but as I mentioned, I haven't played with it
yet.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/


