From: Dominique Devienne (ddevienne_at_[hidden])
Date: 2021-02-09 13:19:59


Hi,

I'm using Boost.Asio, not for networking, but simply to parallelize work items, with a simple graph of "processing nodes" as detailed below. It's working fine, but it uses too much memory. I'd like insights on limiting memory use via "throttling", or what I've also seen called "back-pressure".

At a high level, I process two files (A and B), each composed of many "entries", extracting a subset or a transformation of those entries, which I then write into an output file (C). (These three files reach into the many GBs, hence the need for parallelism and for limiting memory use.)

Extracting entries from A and from B are independent operations, each implemented single-threaded, producing independent work items (one per entry to subset or transform), A#1..A#n and B#1..B#m. That's the "fan-out" part, with each work item (task) scheduled on any thread of the Asio pool, since they are all independent.
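
Roughly, the fan-out side looks like this (a simplified sketch; Entry, read_entries() and process_entry() are just placeholder names for my real code):

  #include <boost/asio/thread_pool.hpp>
  #include <boost/asio/post.hpp>
  #include <thread>

  boost::asio::thread_pool pool(std::thread::hardware_concurrency());

  // Single-threaded reader of A; one independent task per extracted entry
  // (the B reader does the same). process_entry() is the subset/transform.
  for (Entry entry : read_entries(file_A)) {
      boost::asio::post(pool, [entry = std::move(entry)] {
          process_entry(entry); // A# work, runs on any pool thread
      });
  }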

Writing to C is also single-threaded, and needs to "fan-in" the work posted by the A#n and B#m functors, so I serialize it by posting to a C-specific strand (it can still run on any thread; as long as the writes are serialized, it doesn't matter which). Let's call all those tasks writing to C the C#1..C#n+m tasks, which are posted to the strand from the A# and B# tasks.
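
And the fan-in side, again simplified (out and write_to_C() are placeholders):

  #include <boost/asio/strand.hpp>

  auto c_strand = boost::asio::make_strand(pool);

  // Inside an A#/B# task: hand the produced data to the C writer.
  // The strand guarantees the write_to_C() calls never overlap, on
  // whichever pool thread happens to pick them up.
  boost::asio::post(c_strand, [out = std::move(out)] {
      write_to_C(out); // C# work, serialized
  });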

My issue is that Boost.Asio seems to schedule an awful lot of A# and B# tasks before getting to the C# tasks, which results in too many C# tasks accumulating in the queue, and thus too much memory being used.

I don't see a way to force more of the "downstream" C tasks to be scheduled before so many A# and B# tasks have been processed, so pending C# tasks pile up in the work queue and, again, use too much memory.
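
The only workaround I can think of is to manually bound the number of entries in flight by blocking the single-threaded readers of A and B, something like the sketch below (hypothetical, names made up as above), but that feels like fighting the library rather than using it:

  #include <mutex>
  #include <condition_variable>

  std::mutex m;
  std::condition_variable cv;
  std::size_t in_flight = 0;
  const std::size_t max_in_flight = 64; // arbitrary limit

  for (Entry entry : read_entries(file_A)) {  // runs on the reader thread, not a pool thread
      {
          // Block the reader until enough earlier entries have been fully written to C.
          std::unique_lock<std::mutex> lk(m);
          cv.wait(lk, [&] { return in_flight < max_in_flight; });
          ++in_flight;
      }
      boost::asio::post(pool, [&, entry = std::move(entry)] {
          auto out = process_entry(entry);    // A# work
          boost::asio::post(c_strand, [&, out = std::move(out)] {
              write_to_C(out);                // C# work, serialized
              { std::lock_guard<std::mutex> lk(m); --in_flight; }
              cv.notify_one();                // entry counted until written, so this bounds both queues
          });
      });
  }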

Could someone please recommend a way to have a more balanced flow of tasks in the graph? Or even an alternative design, if what I do above is not ideal. Is Boost.Asio even suitable in this case?

Thanks, --DD


