Subject: Re: [boost] [gsoc-2013] Boost.Thread/ThreadPool project
From: Niall Douglas (ndouglas_at_[hidden])
Date: 2013-04-29 15:35:02
> > There are many ways systemically of implementing futures. The way you
> > and Anthony have taken to date is a non-intrusive approach such that
> > futures stand alone. This is great for not forcing design onto third
> > parties and is a decentralised POSIX-y approach, but it comes with
> > significant performance and memory costs because it's hard to get
> > standalone, non-intrusive futures to batch themselves without
> > sacrificing performance.
> I have no merit here. This is the approach the standard has defined
No major change to the C++11 futures API is needed. Rather, a
different, but compatible, optimized implementation of future<> gets
returned when used from a central asynchronous dispatcher. So the C++
standard remains in force, only the underlying implementation is more
> > An alternative approach is to assume a central dispatcher/event loop
> > which can optimize then() chaining because it knows of all futures and
> > their relationships currently in flight (this is a much more
> > centralized statist NT-y approach, and you can see that philosophy
> > clearly in Microsoft's N3428). You can see an example of this approach
> > to implementing future chaining with multiple dependencies which
> > extends Boost.ASIO in
> > https://github.com/ned14/TripleGit/blob/master/triplegit/include/async_file_io.hpp#L785.
> > I also went ahead and implemented N3428's
> > when_all() which returns a future chained to multiple input futures,
> > not that I claim the implementation mature yet.
> I don't know ASIO well enough to understand the advantages or liabilities.
> While a central dispatcher/event loop could be useful for some
> other need more concurrency.
> I'll need to take a deep look at ASIO when I have enough time.
I know you'll have studied Microsoft's N3428 in depth. Their proposal is
quite trivial to implement when you have an NT kernel to hand where the
kernel directly manages all multithreading primitives and there is a top
quality APC implementation.
Implementing N3428 without an NT kernel to hand is considerably harder;
indeed there are many, many gotchas, as I know you know. Even then() is
tricky, because code called from then() needs two main call forms,
asynchronous and synchronous (i.e. the original object completes only after
type B then() completes, whereas type A then() lets the original complete
first). Not introducing race conditions is quite hard (for me at least).
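
To make the two call forms concrete, here is a minimal single-threaded sketch. Everything in it is hypothetical: the `operation` class and the names `then_sync`/`then_async` are illustrative only and come from neither N3428 nor Boost.Thread, and it assumes that "type A" is the asynchronous form and "type B" the synchronous one. Being single-threaded, it shows only the ordering requirement, not the races that make a real implementation hard.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of an operation that supports both continuation forms.
class operation {
public:
    using continuation = std::function<void(operation&)>;

    // "Type A" (assumed asynchronous form): runs after the operation
    // has already been marked complete.
    void then_async(continuation f) { after_.push_back(std::move(f)); }

    // "Type B" (assumed synchronous form): must finish before the
    // operation is marked complete.
    void then_sync(continuation f) { before_.push_back(std::move(f)); }

    // Finishes the operation, honouring the ordering of both forms.
    void finish() {
        for (auto& f : before_) f(*this);   // type B runs first
        complete_ = true;                   // only now is the original complete
        for (auto& f : after_) f(*this);    // type A sees a completed original
    }

    bool complete() const { return complete_; }

private:
    std::vector<continuation> before_;
    std::vector<continuation> after_;
    bool complete_ = false;
};

// Records what each continuation observes about the original operation.
std::string demo_then_forms() {
    operation op;
    std::string log;
    op.then_async([&log](operation& o) { log += o.complete() ? "A:done " : "A:pending "; });
    op.then_sync([&log](operation& o) { log += o.complete() ? "B:done " : "B:pending "; });
    op.finish();
    assert(op.complete());
    return log;
}
```

In a multithreaded implementation the window between running the type B continuations and setting the completion flag is exactly where a concurrent waiter or a late then() registration can race, which this sketch makes no attempt to handle.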
> > My concern with Boost.ThreadPool right now is that it could get in the
> > way of the latter approach. I have zero objection to a formulation
> > which embraces both approaches, but when you ask why a thread pool
> > isn't in the standard, I think it's because until we've fixed futures
> > implementation and all the composability and chaining we're sure we
> > need, and made sure that interops smoothly with the forthcoming TR2
> > networking proposal, then and only then is it wise to proceed with
> > threadpool design.
> I'm not aware of what the TR2 networking proposal needs with respect to
> futures or thread pools. Again, I'll take a look at the current proposals.
Futures come in because you'd naturally use std::packaged_task<> with
Boost.ASIO. It returns a future.
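
A minimal sketch of that pattern follows. To stay self-contained it does not use Boost.ASIO itself: `tiny_dispatcher` is a hypothetical stand-in for an event loop with post()/run() semantics, and `post_task` is an illustrative helper, not a library function. Only `std::packaged_task` and `std::future` are real standard facilities here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for an event loop: handlers are queued by post()
// and executed later, on the calling thread, by run().
class tiny_dispatcher {
public:
    void post(std::function<void()> handler) { handlers_.push(std::move(handler)); }

    // Runs every queued handler and returns how many were executed.
    std::size_t run() {
        std::size_t executed = 0;
        while (!handlers_.empty()) {
            std::function<void()> handler = std::move(handlers_.front());
            handlers_.pop();
            handler();
            ++executed;
        }
        return executed;
    }

private:
    std::queue<std::function<void()>> handlers_;
};

// Wraps `work` in a std::packaged_task, queues it, and hands back its future.
// packaged_task is move-only while std::function requires a copyable
// callable, hence the shared_ptr.
std::future<int> post_task(tiny_dispatcher& loop, std::function<int()> work) {
    auto task = std::make_shared<std::packaged_task<int()>>(std::move(work));
    std::future<int> result = task->get_future();
    loop.post([task] { (*task)(); });
    return result;
}

// Usage: nothing executes until the loop runs.
int demo_packaged_task() {
    tiny_dispatcher loop;
    std::future<int> answer = post_task(loop, [] { return 6 * 7; });
    assert(answer.valid());
    loop.run();            // executes the task, which makes the future ready
    return answer.get();
}
```

The caller only ever sees a `std::future<int>`, so the dispatcher behind it is free to change how and where the work runs.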
> > All that said, this is purely a cautious opinion of mine. You're one
> > of the Boost.Thread maintainers, so I happily defer to your judgment
> > on this. If you think we need a threadpool now, you'll find no
> > opposition from me.
> IIUC, you are not suggesting that the student use the ASIO approach, but
> that it is not time to solve thread_pool now. Could you confirm?
> If so, could you tell me what is blocking us from trying to solve it?
We ought to finish N3428 for pure POSIX first. During implementation, you
will find many areas where significant optimization using a central
dispatcher becomes possible. Specialisations of N3428 for some arbitrary
central dispatcher might be worth prototyping as a thought exercise. After
that I see no reason to hesitate on thread_pool as it now has the type
inferences available to it to have an optimal design.
> I was not aware that there was an AFSIO GSoC project as described here
> https://svn.boost.org/trac/boost/wiki/SoC2013#Boost.ASIO and that this
> project also proposes a thread pool. I understand your concerns better now.
I hadn't announced it to this list for various reasons. Hopefully I'll have
news there soon.
For reference, the AFSIO/AFIO project is *not* a threadpool. It's a batch
asynchronous execution engine based on Boost.ASIO that lets you chain, in
vast numbers, huge arrays of std::function<> whose returns are fetched using
std::future<> to be executed asynchronously according to specified
dependencies e.g. if A and B and C, then D, then E-G. That sort of thing.
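
The shape of such a dependency graph can be sketched with nothing but the standard library. This is not the AFIO API: `run_dependency_graph` and the toy integer payloads are hypothetical, and the sketch naively spends one thread per task via `std::async`, which is precisely the cost a batch engine with a central dispatcher would aim to avoid.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Hypothetical example graph: A, B and C are independent; D needs all three;
// E, F and G each need D.
int run_dependency_graph() {
    // A, B, C start immediately and may run concurrently.
    std::shared_future<int> a = std::async(std::launch::async, [] { return 1; }).share();
    std::shared_future<int> b = std::async(std::launch::async, [] { return 2; }).share();
    std::shared_future<int> c = std::async(std::launch::async, [] { return 3; }).share();

    // D blocks until A, B and C have all produced their values.
    std::shared_future<int> d = std::async(std::launch::async, [a, b, c] {
        return a.get() + b.get() + c.get();
    }).share();

    // E, F, G each block on D, then run independently of one another.
    std::vector<std::future<int>> tail;
    for (int k = 1; k <= 3; ++k) {
        tail.push_back(std::async(std::launch::async, [d, k] { return d.get() * k; }));
    }

    // Collect the results of E, F and G.
    int total = 0;
    for (auto& f : tail) total += f.get();
    return total;
}
```

With these payloads D evaluates to 6 and E, F, G to 6, 12 and 18. Every dependent task here parks a thread inside get(); an engine that knows the whole graph up front can instead schedule D only once A, B and C have finished.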
One thing presently implemented on that engine is asynchronous file i/o, but
in the next month or two you'll hopefully see batch parallel SHA256 using
4-SHA256 SSE2 and NEON implementations also added to the asynchronous
engine. The idea is that the engine is fairly generic for anywhere where you
do need to chain lots of coroutine type items together (not that it supports
Boost.Coroutine yet). v1 isn't particularly generic nor optimal, but I'm
hoping with feedback from Boost that v2 in a few years' time would be much
--- Opinions expressed here are my own and do not necessarily represent those of BlackBerry Inc.