Subject: Re: [boost] [gsoc-2013] Boost.Thread/ThreadPool project
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2013-04-30 02:40:07

On 29/04/13 21:35, Niall Douglas wrote:
>>> There are many ways systemically of implementing futures. The way you
>>> and Anthony have taken to date is a non-intrusive approach such that
>>> futures stand alone. This is great for not forcing design onto third
>>> parties and is a decentralised POSIX-y approach, but it comes with
>>> significant performance and memory costs because it's hard to get
>>> standalone, non-intrusive futures to batch themselves without
>>> sacrificing performance.
>> I have no merit here. This is the approach the standard has defined for
>> futures.
> No major changes to the C++11 futures API are needed. Rather, a
> different but compatible, optimized implementation of future<> gets
> returned when used from a central asynchronous dispatcher. So the C++
> standard remains in force, only the underlying implementation is more
> optimal.

Could you explain to me what a central asynchronous dispatcher is? Is it not a
single worker thread?
If yes, I agree that futures that can be used only on a single-threaded
dispatcher can be optimized, but how would the type system forbid using them
elsewhere? Wouldn't this raise the same issues as async? Note that a different
future has been created as the result of async.

If not, could you please elaborate on what kind of optimizations are possible?
>>> An alternative approach is to assume a central dispatcher/event loop
>>> which can optimize then() chaining because it knows of all futures and
>>> their relationships currently in flight (this is a much more
>>> centralized statist NT-y approach, and you can see that philosophy
>>> clearly in Microsoft's N3428). You can see an example of this approach
>>> to implementing future chaining with multiple dependencies which
>>> extends Boost.ASIO in
>>> _file_ io.hpp#L785. I also went ahead and implemented N3428's
>>> when_all() which returns a future chained to multiple input futures,
>>> not that I claim the implementation mature yet.
>> I don't know ASIO well enough to understand the advantages or liabilities.
>> While a central dispatcher/event loop could be useful for some
>> applications, others need more concurrency.
>> I'll need to take a deep look at ASIO when I have enough time.
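
For readers unfamiliar with the when_all() idea mentioned above, here is a minimal sketch in standard C++ of a future that becomes ready once several input futures are ready. The names when_all_sketch and run_when_all_example are hypothetical, invented for this illustration; this is neither the linked implementation nor the N3428 wording, and it simply spends a helper thread on waiting.

```cpp
#include <future>
#include <utility>
#include <vector>

// Hypothetical helper (not N3428's when_all): returns a future that becomes
// ready once every input future is ready. A helper task blocks on each input
// in turn, which is simple but costs a thread while it waits.
template <typename T>
std::future<std::vector<std::future<T>>>
when_all_sketch(std::vector<std::future<T>> inputs)
{
    return std::async(std::launch::async,
        [](std::vector<std::future<T>> fs) -> std::vector<std::future<T>> {
            for (auto& f : fs) f.wait();   // wait only, do not consume results
            return fs;                     // hand the ready futures back
        },
        std::move(inputs));
}

// Illustrative use: two independent tasks joined by one combined future.
int run_when_all_example()
{
    std::vector<std::future<int>> inputs;
    inputs.push_back(std::async(std::launch::async, [] { return 1; }));
    inputs.push_back(std::async(std::launch::async, [] { return 2; }));

    std::vector<std::future<int>> ready = when_all_sketch(std::move(inputs)).get();

    int sum = 0;
    for (auto& f : ready) sum += f.get();  // every get() returns immediately
    return sum;
}
```

A dispatcher that knows about all in-flight futures could presumably avoid the blocked helper thread; this sketch only shows the observable behaviour.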
> I know you'll have studied Microsoft's N3428 in depth. Their proposal is
> quite trivial to implement when you have an NT kernel to hand where the
> kernel directly manages all multithreading primitives and there is a top
> quality APC implementation.
> Implementing N3428 without an NT kernel to hand is very considerably harder,
> indeed there are many, many gotchas as I know you know. Even then() is
> tricky, because code called from then() needs two main call forms,
> asynchronous and synchronous (i.e. the original object completes only after
> type B then() completes, whereas type A then() lets the original complete
> first). Not introducing race conditions is quite hard (for me at least).
Good point. I have suggested passing a reference to the value type to then(),
instead of the future. This should solve the issue.

Not introducing race conditions is quite hard in general. This is why libraries must provide interfaces that help the user to avoid them.
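
To make the two continuation styles concrete, here is a minimal sketch in standard C++. The helpers then_future and then_value are hypothetical names invented for this illustration; they are not part of Boost.Thread or N3428, and they are built on std::async only to keep the example self-contained.

```cpp
#include <future>
#include <utility>

// Hypothetical helper: the continuation is handed the antecedent future
// and has to call get() (and deal with any stored exception) itself.
template <typename T, typename F>
auto then_future(std::future<T> f, F cont)
    -> std::future<decltype(cont(std::move(f)))>
{
    using R = decltype(cont(std::move(f)));
    return std::async(std::launch::async,
        [](std::future<T> fut, F c) -> R { return c(std::move(fut)); },
        std::move(f), std::move(cont));
}

// Hypothetical helper: the continuation is handed a reference to the value.
// If the antecedent stored an exception, get() rethrows it here and the
// continuation is never called; the exception ends up in the returned future.
template <typename T, typename F>
auto then_value(std::future<T> f, F cont)
    -> std::future<decltype(cont(std::declval<T&>()))>
{
    using R = decltype(cont(std::declval<T&>()));
    return std::async(std::launch::async,
        [](std::future<T> fut, F c) -> R {
            T value = fut.get();
            return c(value);
        },
        std::move(f), std::move(cont));
}

// Illustrative use of both styles on the same computation.
int run_then_future_example()
{
    std::future<int> a = std::async(std::launch::async, [] { return 20; });
    return then_future(std::move(a),
                       [](std::future<int> fa) { return fa.get() + 1; }).get();
}

int run_then_value_example()
{
    std::future<int> a = std::async(std::launch::async, [] { return 20; });
    return then_value(std::move(a), [](int& v) { return v + 1; }).get();
}
```

In the value-passing form the continuation never touches a future, so it has no opportunity to block on or mishandle the antecedent.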

>>> My concern with Boost.ThreadPool right now is that it could get in the
>>> way of the latter approach. I have zero objection to a formulation
>>> which embraces both approaches, but when you ask why a thread pool
>>> isn't in the standard, I think it's because until we've fixed futures
>>> implementation and all the composability and chaining we're sure we
>>> need, and made sure that interops smoothly with the forthcoming TR2
>>> networking proposal, then and only then is it wise to proceed with
>>> threadpool design.
>> I'm not aware of what the TR2 networking proposal needs with respect to
>> futures or thread pools. Again, I'll take a look at the current proposals.
> Futures come in because you'd naturally use std::packaged_task<> with
> Boost.ASIO. It returns a future.
Could you point me to the networking proposal paper that has
packaged_task<> in its user interface? I would expect this to be an
implementation detail.

And future<>?
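
As background for the packaged_task<> remark, here is a minimal sketch of how std::packaged_task hands a future back to the caller. It does not use Boost.ASIO: a plain std::thread stands in for whatever dispatcher would run the task, and run_packaged_task_example is a hypothetical name used only for this illustration.

```cpp
#include <future>
#include <thread>
#include <utility>

// Wraps a callable in std::packaged_task, obtains the associated future,
// then runs the task on another thread. The std::thread is an assumption
// standing in for a dispatcher; no ASIO API is shown here.
int run_packaged_task_example()
{
    std::packaged_task<int(int, int)> task([](int a, int b) { return a + b; });
    std::future<int> result = task.get_future();  // must be taken before moving the task

    std::thread worker(std::move(task), 2, 3);    // the task runs as the thread function
    int value = result.get();                     // blocks until the task has run
    worker.join();
    return value;
}
```

Whether the future appears in a library's user interface or stays an implementation detail is a design decision separate from this mechanism.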
>>> All that said, this is purely a cautious opinion of mine. You're one
>>> of the Boost.Thread maintainers, so I happily defer to your judgment
>>> on this. If you think we need a threadpool now, you'll find no
>>> opposition from me.
>> IIUC, you are not suggesting that the student use the ASIO approach, but
>> that it is not yet time to tackle thread_pool. Could you confirm?
>> If so, could you tell me what is blocking us from trying to solve it now?
> We ought to finish N3428 for pure POSIX first. During implementation, you
> will find many areas where significant optimization using a central
> dispatcher becomes possible. Specialisations of N3428 for some arbitrary
> central dispatcher might be worth prototyping as a thought exercise. After
> that I see no reason to hesitate on thread_pool as it now has the type
> inferences available to it to have an optimal design.
I understand your optimization concerns, and I hope you will be
able to get what you want.

>> I was not aware that there was an AFSIO GSoC project as described here,
>> and that this project also proposes a thread pool. I now understand your
>> concerns better.
> I hadn't announced it to this list for various reasons. Hopefully I'll have
> news there soon.
> For reference, the AFSIO/AFIO project is *not* a threadpool. It's a batch
> asynchronous execution engine based on Boost.ASIO that lets you chain, in
> vast numbers, huge arrays of std::function<> whose returns are fetched using
> std::future<> to be executed asynchronously according to specified
> dependencies e.g. if A and B and C, then D, then E-G. That sort of thing.
So the thread pool you need is an internal one that is adapted to your
particular needs, isn't it?
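
The "if A and B and C, then D, then E" dependency pattern described above can be sketched with nothing but standard futures. This is not the AFIO API, and run_dependency_example is a hypothetical name; the sketch only shows the shape of the dependency graph.

```cpp
#include <future>
#include <utility>

// A, B and C are independent; D needs all three; E needs D.
// Each dependent task simply blocks in get() until its inputs are ready.
int run_dependency_example()
{
    std::shared_future<int> a = std::async(std::launch::async, [] { return 1; }).share();
    std::shared_future<int> b = std::async(std::launch::async, [] { return 2; }).share();
    std::shared_future<int> c = std::async(std::launch::async, [] { return 3; }).share();

    // D: runs once A, B and C have all produced a value.
    std::future<int> d = std::async(std::launch::async,
        [a, b, c] { return a.get() + b.get() + c.get(); });

    // E: runs once D has produced a value.
    std::future<int> e = std::async(std::launch::async,
        [](std::future<int> dep) { return dep.get() * 10; },
        std::move(d));

    return e.get();
}
```

Note that D and E each occupy a thread while they wait on their inputs. With large numbers of chained items that cost adds up, which appears to be the kind of overhead a batch engine with a central dispatcher is meant to avoid.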
> One thing presently implemented on that engine is asynchronous file i/o, but
> in the next month or two you'll hopefully see batch parallel SHA256 using
> 4-SHA256 SSE2 and NEON implementations also added to the asynchronous
> engine. The idea is that the engine is fairly generic for anywhere where you
> do need to chain lots of coroutine type items together (not that it supports
> Boost.Coroutine yet). v1 isn't particularly generic nor optimal, but I'm
> hoping with feedback from Boost that v2 in a few years' time would be much
> improved.
As Oliver noted, you could take a look at Boost.Fiber and Boost.Task.

