Subject: Re: [boost] [Fibers] Performance
From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2014-01-14 09:03:13

> > It would be interesting to see this number when giving 1..N cores to
> > the scheduler.
> > Things like contention caused by the
> > work stealing or by NUMA effects such as when you start stealing across
> > NUMA domains usually overshadow the memory allocation costs.
> > Additionally, the quality of the scheduler implementation affects things
> > gravely.
> > You might want to compare the
> > performance of your library with other existing solutions (for
> > instance TBB, qthreads, openmp, HPX). The link I provided above will
> > give you a set of trivial tests for those. Moreover, we'd be happy to
> > add an equivalent test for your library to our repository.
> after re-reading I have the impression that there is a
> misunderstanding.

I hope not.

> boost.fiber is a thin wrapper over coroutines (each fiber contains one
> coroutine)
> - the library schedules and synchronizes fibers (as requested on the
> developer list in 2013) in one thread.
> the fibers in this lib are agnostic of threads - I've only added some
> support that the classes (mutex, condition_variable) could be used in a
> multi-threaded context.
> combining fibers with threads should be done in another, more
> sophisticated library (at higher level).
> I believe you can't and shouldn't compare fibers with qthreads, TBB or
> openmp.
> I'll write a test measuring the overhead of a fiber running in one thread
> (as already described above) first.

I beg to disagree. Surely, you run fibers on top of OS-threads (in your case
using the coroutines mechanism). However, every fiber is semantically
indistinguishable from a std::thread (if implemented properly). It has a
dedicated function to execute, it represents a context of execution, you can
synchronize it with other fibers, etc. In fact nothing in the C++ Standard
implies that a std::thread has to be implemented using OS (kernel) threads,
which is why we decided to name our lightweight tasks 'hpx::thread'; they
expose 100% of the interface mandated for std::thread.
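
To illustrate what "semantically indistinguishable" could mean in practice,
here is a minimal sketch of code written only against the std::thread
interface (construction from a callable, plus join()). The name run_and_join
is hypothetical and used for illustration only. The sketch is exercised here
with std::thread alone; whether a given lightweight thread type can be
substituted depends on how much of that interface it really provides.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: spawns `n` execution contexts of type Thread, each
// incrementing a shared counter once, joins them all, and returns the
// final count. It relies only on two parts of the std::thread interface:
// construction from a callable and join().
template <typename Thread>
int run_and_join(int n) {
    std::atomic<int> counter{0};
    std::vector<Thread> workers;
    workers.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&counter] {
            counter.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) {
        w.join();  // wait for every context to finish
    }
    return counter.load();
}
```

Calling run_and_join<std::thread>(8) runs eight kernel threads; the same
template should accept any other type that models this part of the
interface.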

If you run on several cores (OS-threads), you start executing your fibers
concurrently. AFAIU, your library is clearly designed for this, otherwise
you wouldn't have implemented special, fiber-oriented synchronization
primitives or work stealing capabilities.

To clarify, I'm not talking about measuring the performance of (kernel)
threads; rather, I would like you to give us performance data for
Boost.Fiber so we can understand what overheads are imposed by using
fibers in the first place.

Absolute numbers do not mean anything beyond a single machine. To get more
than that, I was suggesting running equivalent performance benchmarks using
other, similar libraries, such as TBB, OpenMP, HPX, etc., as this would give
a qualitative picture regardless of the machine the tests are run on. And
the libraries I listed clearly implement a semantically equivalent idiom:
lightweight parallelism (be it a task in TBB, a fiber in Boost.Fiber, a
hpx::thread, or a qthread, etc.).
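
One way such a comparison could be set up is to report the per-task cost as
a ratio against a baseline measured on the same machine, rather than as an
absolute time. The following is only a sketch under stated assumptions: it
uses nothing but the standard library, std::thread stands in for whichever
library is under test, and the names seconds_per_task and
thread_overhead_ratio are hypothetical.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: average wall-clock seconds per call of `spawn_one`
// over `n` repetitions.
template <typename F>
double seconds_per_task(int n, F spawn_one) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) {
        spawn_one();
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / n;
}

// Hypothetical helper: cost of creating and joining one std::thread that
// runs an empty task, relative to doing a trivial piece of work inline.
// std::thread is only a stand-in for the library being measured.
double thread_overhead_ratio(int n) {
    volatile int sink = 0;  // volatile so the baseline loop is not removed
    double inline_cost = seconds_per_task(n, [&sink] { sink = sink + 1; });
    double spawn_cost = seconds_per_task(n, [] {
        std::thread t([] {});  // empty task: measures pure overhead
        t.join();
    });
    return spawn_cost / inline_cost;
}
```

Swapping the body of the second lambda for the equivalent spawn-and-join
call of another library would give one ratio per library, all taken on the
same machine and therefore easier to compare across machines than raw
timings.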

Hope this clarifies what I had in mind.
Regards Hartmut
