Subject: Re: [boost] Stacking iterators vs. dataflow
From: David Abrahams (dave_at_[hidden])
Date: 2008-09-03 09:16:20


on Wed Sep 03 2008, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:

> I just noticed this in the "lifetime of ranges vs. iterators" thread (which I've
> not really been following):
>
> Arno Schödl wrote:
>> rng | filtered( funcA ) | filtered( funcB ) | filtered( funcC ) |
>> filtered( funcD ) | filtered( funcE )
>
> I thought it worth pointing out the similarity, and also the difference, between
> this and the proposed dataflow notation. Here, operator| is being used like a
> shell pipe operator. In dataflow, operator| has a quite different meaning: it's
> a vertical line, distributing the output of "rng" to the inputs of the funcs in
> parallel.

Very interesting.

> Confusing, perhaps?

Perhaps.

> Anyway you could presumably write something like
>
> rng >>= funcA >>= funcB ....

For which library are you suggesting that notation?

> and I would be interested to hear how the two implementations compare.
> Is it true to say that stacked iterators implement a "data pull"
> style, while dataflow implements "data push"?

I believe that's correct.
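
To make the pull/push distinction concrete, here is a minimal sketch in plain C++ that does not use the real Boost iterator adaptors or the proposed dataflow library. The names Puller, pull_from, pull_filter and push_filter are hypothetical, invented for illustration. In the pull version the consumer asks the last stage for a value and each stage asks its upstream; in the push version the producer drives and each stage forwards accepted values downstream.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

inline bool is_even(int x) { return x % 2 == 0; }
inline bool is_positive(int x) { return x > 0; }

// ---- Pull style: the consumer drives. -------------------------------
// A Puller writes the next value into `out` and returns true, or
// returns false when the sequence is exhausted.
using Puller = std::function<bool(int&)>;

// Source stage: hands out the elements of v one at a time.
// v must outlive the returned Puller (it is captured by reference).
Puller pull_from(const std::vector<int>& v) {
    std::size_t i = 0;
    return [&v, i](int& out) mutable -> bool {
        if (i == v.size()) return false;
        out = v[i++];
        return true;
    };
}

// Filter stage: keeps pulling from upstream until pred accepts a value.
Puller pull_filter(Puller upstream, std::function<bool(int)> pred) {
    return [upstream, pred](int& out) mutable -> bool {
        while (upstream(out)) {
            if (pred(out)) return true;
        }
        return false;
    };
}

std::vector<int> run_pull(const std::vector<int>& in) {
    Puller stage = pull_filter(pull_filter(pull_from(in), is_even), is_positive);
    std::vector<int> out;
    int x = 0;
    while (stage(x)) out.push_back(x);  // the consumer requests each value
    return out;
}

// ---- Push style: the producer drives. -------------------------------
// Filter stage: forwards a value to `sink` only if pred accepts it.
std::function<void(int)> push_filter(std::function<bool(int)> pred,
                                     std::function<void(int)> sink) {
    return [pred, sink](int x) { if (pred(x)) sink(x); };
}

std::vector<int> run_push(const std::vector<int>& in) {
    std::vector<int> out;
    std::function<void(int)> sink = [&out](int x) { out.push_back(x); };
    std::function<void(int)> stage =
        push_filter(is_even, push_filter(is_positive, sink));
    for (int x : in) stage(x);          // the producer pushes each value
    return out;
}
```

Both functions compute the same result; what differs is which end of the chain owns the loop.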

> I also note that Arno wants to use stacked iterators because this alternative:
>
> result = fn1( fn2( fn3( fn4( huge_document ) ) ) );
>
> creates large intermediates and requires dynamic allocation.

Yes, that's one of the classic reasons for using iterator adaptors.
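
As a rough illustration of that point, the sketch below contrasts an eager composition, which materializes a full intermediate container per stage, with a stacked adaptor that applies both stages on dereference. mapped_iter and make_mapped are hypothetical stand-ins written for this example, not the Boost.Iterator types, and they implement only the few operations the loop needs.

```cpp
#include <vector>

inline int add_one(int x) { return x + 1; }
inline int twice(int x) { return 2 * x; }

// Eager composition: each stage builds a complete intermediate vector.
std::vector<int> eager(const std::vector<int>& in) {
    std::vector<int> a;                       // intermediate, same size as in
    for (int x : in) a.push_back(add_one(x));
    std::vector<int> b;
    for (int x : a) b.push_back(twice(x));
    return b;
}

// Minimal adaptor: applies f whenever the iterator is dereferenced.
template <class Iter, class F>
struct mapped_iter {
    Iter it;
    F f;
    int operator*() const { return f(*it); }
    mapped_iter& operator++() { ++it; return *this; }
    bool operator!=(const mapped_iter& other) const { return it != other.it; }
};

template <class Iter, class F>
mapped_iter<Iter, F> make_mapped(Iter it, F f) {
    mapped_iter<Iter, F> m = {it, f};
    return m;
}

// Stacked adaptors: one pass over the input, no intermediate container.
std::vector<int> lazy(const std::vector<int>& in) {
    auto first = make_mapped(make_mapped(in.begin(), add_one), twice);
    auto last  = make_mapped(make_mapped(in.end(), add_one), twice);
    std::vector<int> out;
    for (; first != last; ++first) out.push_back(*first);
    return out;
}
```

Only the final result is allocated in lazy(); each element flows through both functions before the next one is touched.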

> Again, a framework that allowed buffering of "sensible size" chunks
> and potentially distributed the work between threads could be a good
> solution.

Yes, parallelizing operations on such structures is an interesting
problem. I think it may require the imposition of a segmented view over
even nonsegmented structures.
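
One possible reading of "imposing a segmented view", sketched under my own assumptions: describe a flat sequence as a list of half-open index ranges, so that each range can be processed independently. The names segment, segment_view and sum_by_segments are hypothetical. The segments are processed sequentially here to keep the example small; since they do not overlap, each could instead be handed to a separate thread and the partial results combined afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// A segment is a half-open index range [first, second).
typedef std::pair<std::size_t, std::size_t> segment;

// Cover the indices [0, n) with consecutive segments of at most `chunk`
// elements each. Precondition: chunk > 0.
std::vector<segment> segment_view(std::size_t n, std::size_t chunk) {
    assert(chunk > 0);
    std::vector<segment> segs;
    for (std::size_t b = 0; b < n; b += chunk)
        segs.push_back(segment(b, std::min(b + chunk, n)));
    return segs;
}

// Reduce each segment on its own, then combine the partial sums.
long sum_by_segments(const std::vector<int>& v, std::size_t chunk) {
    long total = 0;
    for (const segment& s : segment_view(v.size(), chunk)) {
        long partial = std::accumulate(v.begin() + s.first,
                                       v.begin() + s.second, 0L);
        total += partial;  // combining step; partials are independent
    }
    return total;
}
```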

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
