Subject: Re: [boost] [RFC] string inserter/extractor "q u o t i n g"
From: Rob Stewart (robertstewart_at_[hidden])
Date: 2010-06-23 21:59:43


On 6/23/2010 2:51 PM, Daniel James wrote:
> On 23 June 2010 11:51, Stewart, Robert<Robert.Stewart_at_[hidden]> wrote:
>>
>> Those other approaches are heavier than these algorithms
>
> You often need to use some kind of parser just to get the quoted
> string in the first place.
>
>> which can serve simple cases quite well.
>
> What are these simple cases?

CSV fields, pathnames, log messages.

>> If you'd care to enumerate the special cases to which you allude, we can consider how best to address them, if support is warranted.
>
> Some examples are: supporting multiple delimiter characters (e.g.
> supporting both 'x' and "x"), delimiters made up of multiple
> characters (e.g., """x"""), delimiter pairs (e.g. {x}), meaningful
> escapes (e.g. '\n' meaning newline), whether newlines are allowed
> between quotes or if they should end the quoted string, how multiple
> quoted strings are treated (e.g. in C whitespace separated quoted
> strings are concatenated, in your algorithm the space between them is
> included), whether the parsing should be strict or loose and if it is
> loose, how should it recover from errors.

Those are definitely cases that I didn't intend this algorithm to
cover, except perhaps multiple delimiter characters and paired
delimiters, which I hadn't considered.

Semantic meaning is definitely domain-specific, as is the
treatment of multiple delimited substrings. In the latter case,
while simply removing the internal delimiters is legitimate, so
is handling just the first and last characters and ignoring
delimiters in the rest.

When considered as the inverse of quote(), unquote() should
simply strip leading and trailing delimiters and look for escaped
delimiters and escaped escape characters within. To supply the
extra semantics you've suggested, quote() must also be enhanced
significantly.
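
A minimal sketch of that inverse relationship may help here. The
signatures below are hypothetical, chosen only for illustration; they
are not the interface proposed in this thread. The sketch assumes a
single delimiter character and a single escape character, and it
assumes that input not wrapped in delimiters is returned unchanged.

```cpp
#include <cstddef>
#include <string>

// Hypothetical signatures for illustration only.
// Wraps s in delim and escapes any embedded delimiter or escape character.
std::string quote(const std::string& s, char delim = '"', char escape = '\\')
{
    std::string result(1, delim);
    for (char c : s) {
        if (c == delim || c == escape) {
            result += escape;
        }
        result += c;
    }
    result += delim;
    return result;
}

// Inverse of quote(): strips the leading and trailing delimiter and
// collapses escaped delimiters and escaped escape characters.
std::string unquote(const std::string& s, char delim = '"', char escape = '\\')
{
    // Assumption: input that is not delimited is passed through untouched.
    if (s.size() < 2 || s.front() != delim || s.back() != delim) {
        return s;
    }
    std::string result;
    // Walk the body only, skipping the first and last characters.
    for (std::size_t i = 1; i + 1 < s.size(); ++i) {
        char c = s[i];
        // An escape counts only when followed, inside the body, by a
        // delimiter or another escape; otherwise it is kept literally.
        if (c == escape && i + 2 < s.size()
            && (s[i + 1] == delim || s[i + 1] == escape)) {
            c = s[++i];
        }
        result += c;
    }
    return result;
}
```

With these definitions, unquote(quote(s)) yields s for any s. Only the
first and last characters are treated as delimiters, so unescaped
delimiters inside the body are left in place. An escape followed by any
other character has no special meaning, which is why '\n' is not turned
into a newline.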

___
Rob

