Subject: Re: [boost] [thread] Alternate future implementation and future islands.
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-03-19 15:51:05


On 19 Mar 2015 at 18:05, Giovanni Piero Deretta wrote:

> > Your future still allocates memory, and is therefore costing about
> > 1000 CPU cycles.
>
> 1000 clock cycles seems excessive with a good malloc implementation.

Going to main memory on a cache line miss costs about 250 clock
cycles, so no, it isn't. Obviously slower processors spend fewer
cycles on a cache line miss.

> Anyway, the plan is to add support for a custom allocator. I do not think
> you can realistically have a non-allocating future *in the general case*
> (you might optimise some cases of course).

We disagree. Non-allocating futures are not just feasible but
straightforward, though if you try doing a composed wait on them then
yes, they will need to be converted to shared state. Tony van Eerd
did a presentation a few C++ Nows ago on non-allocating futures. I
did not steal his idea subconsciously one little bit! :)
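
A minimal sketch of the general idea, not Tony's design nor any
shipping implementation: the promise and the future each hold a raw
pointer to the other, the result is stored inside the future itself,
and moving either end re-targets its peer, so no heap-allocated shared
state is needed. All names here (nf_promise, nf_future) are
hypothetical. It assumes C++17 for std::optional, and it is
deliberately single-threaded: a real version would have to guard the
two pointers with a spinlock or an atomic protocol.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical names; single-threaded sketch with no synchronisation.
template <class T> class nf_promise;

template <class T>
class nf_future {
    friend class nf_promise<T>;
    nf_promise<T>* promise_ = nullptr;  // back-pointer to the producer, if alive
    std::optional<T> value_;            // the result lives inside the future
    explicit nf_future(nf_promise<T>* p) : promise_(p) {}
public:
    nf_future() = default;
    nf_future(nf_future&& o) noexcept
        : promise_(o.promise_), value_(std::move(o.value_)) {
        o.promise_ = nullptr;
        o.value_.reset();
        if (promise_) promise_->future_ = this;  // re-target the producer
    }
    nf_future(const nf_future&) = delete;
    nf_future& operator=(const nf_future&) = delete;
    nf_future& operator=(nf_future&&) = delete;
    ~nf_future() { if (promise_) promise_->future_ = nullptr; }

    bool is_ready() const { return value_.has_value(); }
    T get() { return std::move(*value_); }  // precondition: is_ready()
};

template <class T>
class nf_promise {
    friend class nf_future<T>;
    nf_future<T>* future_ = nullptr;    // pointer to the consumer, if alive
public:
    nf_promise() = default;
    nf_promise(nf_promise&& o) noexcept : future_(o.future_) {
        o.future_ = nullptr;
        if (future_) future_->promise_ = this;   // re-target the consumer
    }
    nf_promise(const nf_promise&) = delete;
    nf_promise& operator=(const nf_promise&) = delete;
    ~nf_promise() { if (future_) future_->promise_ = nullptr; }

    // Assumption: called at most once, and before set_value().
    nf_future<T> get_future() {
        nf_future<T> f(this);
        future_ = &f;
        return f;  // if this is a real move, the move constructor re-targets us
    }

    // Writes straight into the future; dropped if the future has gone away.
    void set_value(T v) {
        if (future_) future_->value_ = std::move(v);
    }
};
```

The cost of this layout is that every move of either object touches
the other one, which is where the locking would go in a threaded
version, and why a composed wait over several of them would need them
converted to shared state.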

> I understand what you are aiming at, but I think that the elidability is
> orthogonal. Right now I'm focusing on making the actual synchronisation
> fast and composable in the scenario where the program has committed to make
> a computation async.

This is fine until your compiler supports resumable functions.

> > Exactly as my C11 permit object is. Except mine allows C code and C++
> > code to interoperate and compose waits together.
>
> Not at all. I admit not having studied permit in detail (the doc size is
> pretty daunting) but as far as I can tell the waiting thread will block in
> the kernel.

It can spin or sleep or toggle a file descriptor or HANDLE.

> It provides a variety of ways to block, but the user can't add more.

It provides a hook API with filter C functions which can, to a
limited extent, provide some custom functionality. Indeed the file
descriptor/HANDLE toggling is implemented that way. There is only so
much genericity which can be done with C.

> > My current best understanding of Vicente's plans is that each thread
> > has a thread local condvar. The sole cause of a thread sleeping,
> > apart from i/o, is on that thread local condvar. One therefore has a
> > runtime which keeps a registry of all thread local condvars, and can
> > therefore deduce the correct thread local condvars to wake when
> > implementing a unified wait system also compatible with
> > Fibers/resumable functions.
>
> That doesn't work if a program wants to block, for example, in select, or
> spin on a memory location or a hardware register, or wait for a
> signal, or interoperate with a different userspace thread library or some
> other event queue (asio, qt or whatever), and still also wait for a
> future. Well, you can use future::then, but it has overhead.

I think where we are vaguely heading is that anything Boost.Coroutine
capable will convert blocking into coroutine scheduling. That way, if
a Thread v5 program blocks doing ASIO or AFIO i/o, under the bonnet
it will schedule other fibre work where possible, and any condvar or
mutex blocks could turn into ASIO/AFIO work. A bit like a
"userspace WinRT" I suppose.

But sure, C++ is not WinRT. If the programmer writes an infinite for
loop, he gets blocking behaviour.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/


