
Subject: [boost] GSoC: Boost.Pipeline -- scheduling of segments
From: Benedek Thaler (thalerbenedek_at_[hidden])
Date: 2014-07-01 17:24:57


Hi

I'm working on the GSoC Pipeline project, which is based on the N3534 [1]
proposal. Work in progress can be found on GitHub [2]. A simple example of
using the pipeline:

    pipeline::from(input_container) | transformation1 | t2 | t3 | output_container;

When running this pipeline, items are read from `input_container`,
processed by the transformations, and written to `output_container`.
Each segment should run in parallel with the others; that is the point of
the pipeline.

However, scheduling this work is not trivial. Quoting the proposal:

"The current pipeline framework uses one thread for each stage of the
pipeline. To limit the use of resources, it should be possible to run with
fewer threads, using work-stealing or work-sharing techniques."

We are wondering whether we could improve on this. Let's assume a thread
pool with a single thread and the example pipeline above. On run(), ideally
the thread would read *some* items from `input_container`, apply the
transformations to them, and push the results to `output_container`. That
is: it spends some time on each transformation, then yields and picks up
the next one.

However, doing this implies that the transformations must be reentrant.
This additional constraint must be considered carefully. This behavior is
implemented on the `development` branch [2].

Aside from the two solutions above, Vicente J. Botet Escriba suggested the
following idea: let each thread work on a single transformation until its
queue gets closed (no more input), then move on to the next one. This is
easy to implement and scales up to as many threads as there are segments.
On the other hand, it kills the performance of the "online" use case:

    pipeline::from(read_message_from_socket) | process_message | send_message;

It is not deterministic when the first segment will end, so the pipeline
will hang and won't produce any output. Even in a slightly less strict
scenario, where the end of input is eventually signaled, the latency could
simply be too high.

To summarize, the following options are on the table:

 1. Dedicate a thread to each segment (but what to do with a fixed-size
thread pool?)
 2. Constrain the transformations to be reentrant and time-slice them.
 3. Run each transformation from beginning to end, as long as it has input
to process.

We kindly ask the recipients to share their ideas or opinions, ideally
with a matching use case.

Thanks,
Benedek

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3534.html
[2]: https://github.com/erenon/pipeline/tree/development


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk