From: Ben Young (ben.young_at_[hidden])
Date: 2002-12-12 04:25:36


On Wed, 11 Dec 2002, Beman Dawes wrote:

> At 07:09 AM 12/9/2002, Ben Young wrote:
>
> >I have recently implemented a set of custom output iterators that allow
> >you to very trivially do escaping and quoting on data from any kind of
> >iterator source...
>
> >...
>
> >Would anyone be interested in these iterators being added to boost? They
> >are fully generic, and have some test cases/example usage (though not much
> >more complicated than above)
>
> I think I am.
>
> Sorry to be ambivalent; I have certainly had the need, but never thought of
> custom iterators. So I'd want to be sure that was the best way to meet the
> need.
>
> Did you consider other approaches? Some form of stream customization, for
> example? In some ways the problem seems similar to the formatting issues
> for serialization/persistence libraries. Pros and cons of different
> approaches?
>

That's OK. I'm kind of ambivalent about it as well :). Of course there are
as many ways to perform these kinds of tasks as there are programmers. Here
we use our own custom stream classes called Automata, which can be chained
together, but I (well, one of my colleagues; I am building on his work)
thought it would be nice to come up with something a little more generic.

For a while I thought about using something like a codecvt facet, as escaping
can be considered a transformation to an external representation and
back again, but a) I haven't done much work with locales yet (only been a
professional for ~1.5 years) and b) I wanted something that could be
*easily* used in standard algorithms etc.
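
As a rough illustration of point b), an escaping output iterator that plugs
into a standard algorithm might look like the sketch below. This is not the
code from the original post: the names escape_iterator, make_escaper and
escape_quotes are hypothetical, and it assumes the simplest possible scheme,
where one special character (and the escape character itself) is prefixed
with a backslash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical sketch: escape_iterator and make_escaper are illustrative
// names, not the iterators described in the original post.
template <class Out>
class escape_iterator {
public:
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef std::ptrdiff_t difference_type;
    typedef void pointer;
    typedef void reference;

    escape_iterator(Out out, char special, char esc)
        : out_(out), special_(special), esc_(esc) {}

    // Writing a character forwards it to the wrapped iterator, preceded
    // by the escape character when it needs protecting.
    escape_iterator& operator=(char c) {
        if (c == special_ || c == esc_) {
            *out_ = esc_;
            ++out_;
        }
        *out_ = c;
        ++out_;
        return *this;
    }

    // Usual output-iterator boilerplate: all the work is in operator=.
    escape_iterator& operator*() { return *this; }
    escape_iterator& operator++() { return *this; }
    escape_iterator& operator++(int) { return *this; }

private:
    Out out_;
    char special_;
    char esc_;
};

template <class Out>
escape_iterator<Out> make_escaper(Out out, char special, char esc = '\\') {
    return escape_iterator<Out>(out, special, esc);
}

// Example use with a standard algorithm: backslash-escape double quotes.
inline std::string escape_quotes(const std::string& in) {
    std::string out;
    std::copy(in.begin(), in.end(),
              make_escaper(std::back_inserter(out), '"'));
    return out;
}
```

Because the wrapped iterator is just a template parameter, chaining in this
sketch amounts to nesting one make_escaper call inside another. Escaping
needs no look-behind, so the iterator carries no state between writes.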

The output iterator approach is easy to use in many different situations
but has a number of flaws as far as I can see.

1) I can't think of a way to implement an iterator that does unescaping.
   I can get it to unescape a single escape sequence wherever it appears,
   but it appears to be impossible(?) to implement the chaining approach.
   This is because the iterator must hold on to the last few characters it
   has seen before deciding whether to output them or not. As
   output iterators cannot know when the input has been exhausted, there is
   no way of knowing when to "flush" these stored characters to the
   chained underlying iterator. I'm sure this can be worked out by people
   cleverer than me.

2) It can be slow. Anything other than trivial escaping is best performed
   by a custom approach.
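
The flush problem in point 1 can be seen in a deliberately simplified
sketch. It assumes a single-character escape (a backslash followed by the
literal character) rather than longer sequences, and the names
unescape_iterator, make_unescaper, flush and unescape are hypothetical, not
taken from the original post. The only way out shown here is a manual
flush() call on the iterator returned by the algorithm, which is exactly
the step a generic algorithm will never make on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical sketch of an unescaping output iterator with held-back state.
template <class Out>
class unescape_iterator {
public:
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef std::ptrdiff_t difference_type;
    typedef void pointer;
    typedef void reference;

    explicit unescape_iterator(Out out, char esc = '\\')
        : out_(out), esc_(esc), pending_(false) {}

    unescape_iterator& operator=(char c) {
        if (pending_) {
            // Previous character was the escape: emit this one literally.
            *out_ = c;
            ++out_;
            pending_ = false;
        } else if (c == esc_) {
            // Cannot decide yet; hold the escape back until the next write.
            pending_ = true;
        } else {
            *out_ = c;
            ++out_;
        }
        return *this;
    }

    // Emits a held-back escape character. Nothing calls this automatically:
    // the iterator is never told that the input has ended.
    void flush() {
        if (pending_) {
            *out_ = esc_;
            ++out_;
            pending_ = false;
        }
    }

    // All three return a reference so the pending state is never written
    // to a temporary copy.
    unescape_iterator& operator*() { return *this; }
    unescape_iterator& operator++() { return *this; }
    unescape_iterator& operator++(int) { return *this; }

private:
    Out out_;
    char esc_;
    bool pending_;
};

template <class Out>
unescape_iterator<Out> make_unescaper(Out out, char esc = '\\') {
    return unescape_iterator<Out>(out, esc);
}

// Copies `in` through the unescaper; the caller decides whether to flush.
inline std::string unescape(const std::string& in, bool flush_at_end) {
    typedef unescape_iterator<std::back_insert_iterator<std::string> > iter;
    std::string out;
    iter last = std::copy(in.begin(), in.end(),
                          make_unescaper(std::back_inserter(out)));
    if (flush_at_end) {
        last.flush();
    }
    return out;
}
```

Without the flush, an input that ends in a lone backslash silently loses
that character. In a chain, every stage that holds characters back would
need the same manual call, in order from the outermost stage inwards, and
this sketch does not attempt to solve that.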

Because of these flaws I am again warming to the codecvt approach. Perhaps
being able to do a lexical_cast from one string to another, with an
escaping locale imbued, might be the way to go?

Thanks for the comments

Cheers

Ben

---
> --Beman
>
> PS: We need to move iterator related stuff out of utility and into a Boost
> Iterator library.