Subject: Re: [boost] [Fibers] Performance
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2014-01-15 09:19:24


On 15 Jan 2014 at 7:18, Hartmut Kaiser wrote:

> > Like any C++ probably Boost.Fiber makes many malloc calls per context
> > switch. It adds up.
>
> I don't think that things like a context switch require any memory
> allocation. All you do is to flush the registers, flip the stack pointer,
> and load the registers from the new stack.

A fiber implementation also needs to maintain work and sleep queues,
and those are all STL containers at present, so the scheduling around
a switch can allocate even if the switch itself does not.
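
To make the queue point concrete, here is a minimal sketch. It is not Boost.Fiber's code: Fiber, IntrusiveQueue, CountingAlloc and both helper functions are hypothetical names invented for the illustration. It counts the node allocations a std::list ready queue performs, and contrasts that with an intrusive queue whose link lives inside the fiber record, so enqueueing never calls an allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Hypothetical fiber record for illustration; not a Boost.Fiber type.
struct Fiber {
    int id;
    Fiber* next;  // intrusive link, used only by IntrusiveQueue
    explicit Fiber(int i) : id(i), next(nullptr) {}
};

// Counts every allocation made through CountingAlloc.
static std::size_t g_node_allocs = 0;

// Minimal allocator that forwards to operator new and counts the calls.
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        ++g_node_allocs;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// FIFO queue that threads fibers through their own 'next' member.
class IntrusiveQueue {
    Fiber* head_ = nullptr;
    Fiber* tail_ = nullptr;
public:
    bool empty() const { return head_ == nullptr; }
    void push(Fiber& f) {
        f.next = nullptr;
        if (tail_) tail_->next = &f; else head_ = &f;
        tail_ = &f;
    }
    Fiber& pop() {
        assert(head_ != nullptr);  // precondition: queue is not empty
        Fiber* f = head_;
        head_ = f->next;
        if (!head_) tail_ = nullptr;
        f->next = nullptr;
        return *f;
    }
};

// Enqueue and dequeue n fibers through std::list; return node allocations.
std::size_t node_allocs_for_std_list(int n) {
    std::vector<Fiber> fibers;
    for (int i = 0; i < n; ++i) fibers.emplace_back(i);
    std::list<Fiber*, CountingAlloc<Fiber*>> ready;
    const std::size_t before = g_node_allocs;
    for (Fiber& f : fibers) ready.push_back(&f);
    while (!ready.empty()) ready.pop_front();
    return g_node_allocs - before;
}

// Same traffic through the intrusive queue; no allocator is involved at all.
bool intrusive_queue_is_fifo(int n) {
    std::vector<Fiber> fibers;
    for (int i = 0; i < n; ++i) fibers.emplace_back(i);
    IntrusiveQueue ready;
    for (Fiber& f : fibers) ready.push(f);
    for (int i = 0; i < n; ++i)
        if (ready.pop().id != i) return false;
    return ready.empty();
}
```

Calling node_allocs_for_std_list(n) shows one node allocation per enqueue on the node-based container, while the intrusive variant has no allocation call anywhere on its push or pop path. Whether this cost matters in practice depends on the allocator and the workload.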

> > I think coming within 50% of the performance of Windows Fibers would be
> > more than plenty. After all Boost.Fiber "does more" than Windows Fibers.
>
> It might be sufficient for you but not for everybody else. It wouldn't be
> sufficient for us, for instance. If you build systems relying on fine grain
> parallelism, then efficiently implemented fibers are the only way to go. If
> you need to create billions of threads (fibers), then every microsecond of
> overhead counts billion-fold.

Firstly, I think you underestimate how quick Windows Fibers are: they
have been highly tuned to win SQL Server benchmarks. Secondly,
Boost.Fiber does a great deal more work than Windows Fibers, so no one
can reasonably expect it to be as quick.

> > If I ever had a willing employer, I could get clang to
> > spit out far more malloc optimal C++ at the cost of a new ABI, but I never
> > could get an employer to bite.
>
> Sorry for sidestepping, are you sure compilers do memory allocation as part
> of their way to conform to ABI's? I was always assuming memory allocation is
> done only when explicitly requested by user code.

This is very off topic for this mailing list. However, one of the
projects I proposed at BlackBerry before I was removed was to solve
the substantial Qt allocation overhead caused by PIMPL, by getting
clang to replace many individual operator new calls for temporary
objects with a single alloca() at the base of the call stack. This
breaks ABI because you need to generate an additional copy of every
constructor, one which uses the new purely stack based allocation
mechanism for temporary dynamic memory allocations (we'd also need
to emit additional metadata to help the link and LTCG layer
assemble the right code). Anyway, the idea was deemed too weird to
have any business case, and then of course I was eliminated shortly
thereafter anyway. I should mention that this idea was one of mine
long before joining BlackBerry, and therefore nothing proprietary is
being leaked.
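
A hand-written analogue of the "additional copy of every constructor" idea might look like the sketch below. This is only an illustration under stated assumptions: Widget, its nested Impl, sum_heap and sum_stack are hypothetical names, a fixed-size local buffer stands in for alloca() (which is not standard C++), and nothing here reflects what a modified clang would actually emit. The second constructor builds the private implementation in storage supplied by the caller instead of calling operator new.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical PIMPL-style class; not a Qt type.
class Widget {
    struct Impl {
        int value;
        explicit Impl(int v) : value(v) {}
    };
    Impl* impl_;
    bool owns_heap_;

public:
    // Size and alignment a caller must reserve for the stack-based variant.
    static constexpr std::size_t impl_size = sizeof(Impl);
    static constexpr std::size_t impl_align = alignof(Impl);

    // Ordinary constructor: one heap allocation per object.
    explicit Widget(int v) : impl_(new Impl(v)), owns_heap_(true) {}

    // Extra constructor: places Impl in caller-provided storage, no heap call.
    Widget(int v, void* storage)
        : impl_(nullptr), owns_heap_(false) {
        assert(storage != nullptr);
        impl_ = ::new (storage) Impl(v);
    }

    ~Widget() {
        if (owns_heap_) delete impl_;
        else impl_->~Impl();  // storage belongs to the caller
    }

    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    int value() const { return impl_->value; }
    bool uses_heap() const { return owns_heap_; }
};

// Two temporaries, two operator new calls.
int sum_heap(int a, int b) {
    Widget x(a), y(b);
    return x.value() + y.value();
}

// Same temporaries, backed by one stack buffer reserved up front.
int sum_stack(int a, int b) {
    alignas(Widget::impl_align) unsigned char buf[2 * Widget::impl_size];
    Widget x(a, buf);
    Widget y(b, buf + Widget::impl_size);
    return x.value() + y.value();
}
```

The buffer is declared before the widgets so it outlives them. Done by hand, this changes the class interface, which is one way to see why doing it automatically in the compiler would need a new ABI.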

Niall

-- 
Currently unemployed and looking for work.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/