Subject: Re: [boost] [Fibers] Performance
From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2014-01-15 16:02:53
> > > Like any C++ probably Boost.Fiber makes many malloc calls per
> > > context switch. It adds up.
> > I don't think that things like a context switch require any memory
> > allocation. All you do is to flush the registers, flip the stack
> > pointer, and load the registers from the new stack.
> A fiber implementation would also need to maintain work and sleep queues.
> They're all STL containers at present.
I can see that. You explicitly referred to the context switch, thus my
request for clarification.
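
To illustrate the distinction, here is a minimal sketch of a bare context switch, assuming a POSIX system that provides <ucontext.h>. It is not how Boost.Fiber or Windows Fibers implement the switch. The names ctx_sketch, fiber_stack, fiber_body and ping_pong are made up for this example. The stack is reserved once, up front, and the switch itself only saves and restores register state, so no allocation happens per switch. Note that swapcontext() on glibc also saves and restores the signal mask, so a hand-written switch is typically cheaper than this.

```cpp
#include <cassert>
#include <ucontext.h>

namespace ctx_sketch {  // hypothetical namespace, for illustration only

// Stack for the second context, reserved once; nothing is allocated per switch.
alignas(16) static char fiber_stack[64 * 1024];
static ucontext_t main_ctx, fiber_ctx;
static int steps = 0;

// Body of the second context: bump a counter, switch back, repeat.
static void fiber_body() {
    for (;;) {
        ++steps;
        swapcontext(&fiber_ctx, &main_ctx);  // save our state, resume the caller
    }
}

// Switch into the second context n times; returns how often its body ran.
inline int ping_pong(int n) {
    assert(n >= 0);
    steps = 0;
    getcontext(&fiber_ctx);                          // initialise the context
    fiber_ctx.uc_stack.ss_sp = fiber_stack;          // point it at our stack
    fiber_ctx.uc_stack.ss_size = sizeof fiber_stack;
    fiber_ctx.uc_link = &main_ctx;                   // where to go if the body returns
    makecontext(&fiber_ctx, fiber_body, 0);
    for (int i = 0; i < n; ++i)
        swapcontext(&main_ctx, &fiber_ctx);          // the switch itself
    return steps;
}

}  // namespace ctx_sketch
```

Calling ctx_sketch::ping_pong(n) performs n round trips between the two contexts. Work and sleep queues would sit on top of this, which is where container allocations could come in.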
> > > I think coming within 50% of the performance of Windows Fibers would
> > > be more than plenty. After all Boost.Fiber "does more" than Windows
> > It might be sufficient for you but not for everybody else. It wouldn't
> > be sufficient for us, for instance. If you build systems relying on
> > fine grain parallelism, then efficiently implemented fibers are the
> > only way to go. If you need to create billions of threads (fibers),
> > then every microsecond of overhead counts billion-fold.
> Firstly I think you underestimate how quick Windows Fibers are - they have
> been highly tuned to win SQL Server benchmarks. Secondly, Boost.Fiber does
> a ton load more work than Windows Fibers, so no one can reasonably expect
> it to be as quick.
Whatever the speed of Boost.Fiber, all I would like to see is a measure of
its imposed overheads which would allow everybody to decide whether the
implementation performs well enough for a particular use case. That's
what I was asking for in the very beginning. At the same time, our own
implementation in HPX (on the Windows platform) is using Windows Fibers for
our lightweight thread implementation, so I understand perfectly well what
their imposed overheads are.
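
The kind of measurement asked for here could look roughly like the sketch below. It is a generic timing harness, not part of Boost.Fiber or HPX, and the names ns_per_op and projected_seconds are hypothetical. It times an arbitrary operation with std::chrono::steady_clock and projects the per-operation cost onto a large number of operations, which is the "billion-fold" arithmetic from earlier in the thread: 1 microsecond of overhead times one billion fibers is 1000 seconds.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Average wall-clock cost of one call to op(), in nanoseconds.
// op is whatever is being measured, e.g. create-and-join of one fiber.
template <typename Op>
double ns_per_op(Op op, std::uint64_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) op();
    const auto stop = clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Total overhead in seconds if each of `count` operations costs `ns_each`.
inline double projected_seconds(double ns_each, double count) {
    return ns_each * count / 1e9;
}
```

A caller would pass a lambda that creates and joins one fiber (or one std::thread as a baseline) and compare the resulting figures. A serious benchmark would also need warm-up runs and care that the compiler does not optimise the measured operation away.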
I also understand that Boost.Fiber does more than the Windows Fibers which
are used just for the underlying context switch operation. Still, my main
incentive for voting YES to this review and for considering using this
library as a replacement for HPX's thread implementation would be if it had
superior performance. This is even more true as I know (and have evidence)
that it is possible to come close to the Windows Fibers performance for
lightweight threads exposing the same API as std::thread does (see HPX).
IMHO, Boost.Fiber is a library which - unlike other Boost libraries - has
not been developed as a prototype for a particular API (in which case I'd be
all for accepting subpar performance). It clearly has been developed to
provide a higher performing implementation for an existing API. That means
that if Oliver is not able to demonstrate superior performance over existing
implementations, I wouldn't see any point in having the library in Boost in
the first place.
> > > If I ever had a willing employer, I could get clang to spit out far
> > > more malloc optimal C++ at the cost of a new ABI, but I never could
> > > get an employer to bite.
> > Sorry for sidestepping, are you sure compilers do memory allocation as
> > part of their way to conform to ABI's? I was always assuming memory
> > allocation is done only when explicitly requested by user code.
> This is very off topic for this mailing list. However, one of the projects
> I proposed at BlackBerry before I was removed was to solve the substantial
> Qt allocation overhead because of PIMPL by getting clang to replace much
> use of individual operator new's for temporary objects with a single
> alloca() at the base of the call stack. This broke ABI because you need to
> generate an additional copy of every constructor, one which uses the new
> purely stack based allocation mechanism for temporary dynamic memory
> allocations (also, we'd need to spit out additional metadata to help the
> link and LTCG layer assemble the right code). Anyway the idea was deemed
> too weird to see any business case, and then of course I was eliminated
> shortly thereafter anyway. I should mention that this idea was one of mine
> long before joining BlackBerry, and therefore nothing proprietary is being
Thanks for this explanation.