Subject: Re: [boost] [RFC] string inserter/extractor "q u o t i n g"
From: Daniel James (dnljms_at_[hidden])
Date: 2010-06-23 14:51:01


On 23 June 2010 11:51, Stewart, Robert <Robert.Stewart_at_[hidden]> wrote:
>
>> To be honest, I don't see the value of this. As this is the
>> kind of thing which is handled well in other ways (e.g. using a
>> parser or lexer generator, or a standard data format such as
>> XML, JSON etc.).  There tends to be odd differences in quoting,
>> encoding and escaping styles making a generic function awkward.
>> It's not as specific as a filename extractor and not as generic
>> as a parser and it's not clear why there's a need for something
>> in between.
>
> Those other approaches are heavier than these algorithms

You often need to use some kind of parser just to get the quoted
string in the first place.

> which can serve simple cases quite well.

What are these simple cases?

I could see the use for something which reads and decodes a 'token'
following something like the shell grammar and sets the iterator to
the end of the token. But that's quite a specific and more complicated
grammar, rather than an attempt at a simple general one.
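To make that concrete, here is a rough sketch of the kind of token reader I have in mind. It assumes a much-simplified subset of the shell quoting rules (single quotes are literal, double quotes honour only \" and \\, a backslash outside quotes escapes the next character), and the name read_shell_token is made up for illustration; it is not part of Boost or of the proposal.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of any library): reads one token using a
// simplified subset of shell quoting and leaves `it` just past the token.
// Returns false if there is no token or a quote/escape is left unterminated.
bool read_shell_token(std::string::const_iterator& it,
                      std::string::const_iterator end,
                      std::string& out)
{
    out.clear();

    // Skip leading whitespace.
    while (it != end && std::isspace(static_cast<unsigned char>(*it))) ++it;
    if (it == end) return false;

    // A token runs until the next unquoted whitespace character.
    while (it != end && !std::isspace(static_cast<unsigned char>(*it))) {
        char c = *it++;
        if (c == '\'') {
            // Single quotes: everything up to the closing quote is literal.
            while (it != end && *it != '\'') out += *it++;
            if (it == end) return false;   // unterminated
            ++it;                          // consume closing quote
        } else if (c == '"') {
            // Double quotes: only \" and \\ are treated as escapes here.
            while (it != end && *it != '"') {
                if (*it == '\\' && it + 1 != end &&
                    (it[1] == '"' || it[1] == '\\'))
                    ++it;                  // drop the backslash
                out += *it++;
            }
            if (it == end) return false;   // unterminated
            ++it;                          // consume closing quote
        } else if (c == '\\') {
            // Backslash outside quotes: take the next character literally.
            if (it == end) return false;
            out += *it++;
        } else {
            out += c;
        }
    }
    return true;
}
```

Even this cut-down version has to make several decisions (which escapes count, what an unterminated quote means) that a different format would make differently.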

> If you'd care to enumerate the special cases to which you allude, we
> can consider how best to address them, if support is warranted.

Some examples are: supporting multiple delimiter characters (e.g.
both 'x' and "x"), delimiters made up of multiple characters (e.g.
"""x"""), delimiter pairs (e.g. {x}), meaningful escapes (e.g. '\n'
meaning newline), whether newlines are allowed between quotes or
should end the quoted string, how multiple quoted strings are treated
(e.g. in C, quoted strings separated only by whitespace are
concatenated, whereas in your algorithm the space between them is
included), and whether the parsing should be strict or loose and, if
it is loose, how it should recover from errors.
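As an illustration of how quickly the options pile up, here is a sketch of an unquoting function that covers only some of the cases above. The quote_style struct, its field names and the unquote function are all hypothetical, invented for this example. It is strict (any error returns false, with no recovery) and assumes non-empty delimiters and pos <= in.size().

```cpp
#include <cstddef>
#include <string>

// Hypothetical options, named only for illustration.
struct quote_style {
    std::string open;     // e.g. "\"", "'", "\"\"\"" or "{"
    std::string close;    // e.g. "\"", "'", "\"\"\"" or "}"
    bool decode_escapes;  // decode \n and \t; \X yields X for anything else
    bool allow_newlines;  // may a raw newline appear between the delimiters?
};

// Strict unquoting: on success stores the decoded text in `out`, moves
// `pos` just past the closing delimiter and returns true. On any error
// it returns false and leaves `pos` unchanged.
bool unquote(const std::string& in, std::size_t& pos,
             const quote_style& s, std::string& out)
{
    out.clear();
    if (in.compare(pos, s.open.size(), s.open) != 0) return false;

    std::size_t i = pos + s.open.size();
    while (i < in.size()) {
        // Closing delimiter found: done.
        if (in.compare(i, s.close.size(), s.close) == 0) {
            pos = i + s.close.size();
            return true;
        }
        char c = in[i++];
        if (c == '\n' && !s.allow_newlines) return false;
        if (c == '\\' && s.decode_escapes) {
            if (i == in.size()) return false;  // dangling backslash
            char e = in[i++];
            switch (e) {
                case 'n': out += '\n'; break;
                case 't': out += '\t'; break;
                default:  out += e;    break;  // includes \\ and \"
            }
        } else {
            out += c;
        }
    }
    return false;  // ran off the end without a closing delimiter
}
```

This still says nothing about accepting several delimiter styles at once, concatenating adjacent strings as C does, or loose parsing with error recovery; each of those would need yet another option.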

Daniel