
On Sun, Aug 29, 2010 at 5:34 AM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 29/08/2010 00:53, Dave Abrahams wrote:
I presume you're familiar with http://lafstern.org/matt/segmented.pdf?
I had heard of segmented iterators but hadn't looked at them in detail, so thanks for the link. I'm not sure I want something that's just two-level
Read the paper. It works for as many levels as you want. The localiterator can itself be segmented.
I see some potential for some interesting discussion here, sorry in advance for hijacking this thread.
I think the whole segmented iterators/ranges is a bad idea, as it has quite a few disadvantages and there are alternative, simpler solutions that provide the same benefits.
segmented ranges have the following disadvantages:
- while segmented iterators are easier to write than flat iterators in certain cases, they're still not as easy to write as they could be
I don't think anyone is suggesting replacing flat iterators with segmented iterators -- they would coexist. I don't see anything wrong with letting generic algorithms (like for_each) transparently take advantage of the underlying storage of a sequence.
- making non-trivial operations (any operation that needs to manipulate several adjacent elements [1]) deal with segmented ranges can be so complicated that they can realistically only be written with a flattened non-segmented view of the range.
[1] examples include character encoding conversion, base64 encoding/decoding, gzip encoding/decoding, etc.
While developing a HTTP parser, I wanted to be able to parse between two or more buffers (segments), but that was quite slow. I wanted full performance for the 99% of the time that a valid sequence would exist in a single buffer. I wanted to avoid complicated code to move between segments, and didn't want to copy buffers around. I ended up letting the parser keep taking flat iterators as input -- as you say, clearly these are the simplest kind to work with. On top of that I built another parser that first uses a parser<const char*> as far as it can go on the first segment, and if that doesn't have enough data to return something, it uses a parser<flat_iterator> on as little of the full sequence as possible. Worked wonders, required very little code to do, and kept the core parsing logic simple. Granted HTTP is not all things, and base64/character encoding conversion/gzip are not all things, but I think this way would work fine for at least those. I don't think it's right to say "the whole segmented iterators/ranges is a bad idea" -- I can see plenty of applications where writing for segmented iterators *would* get overly complicated, but I don't see any reason why we'd be forced to use them if something better was available. -- Cory Nelson http://int64.org