
From: Thore Karlsen (sid_at_[hidden])
Date: 2005-01-29 12:44:36


On Sat, 29 Jan 2005 13:22:36 +0100, Pavol Droba <droba_at_[hidden]>
wrote:

>> >Split is designed to not ignore any token. Imagine that you need to
>> >parse a comma-delimited string. Even an empty string can be a valid
>> >token. So this is the reason why your result starts with an empty
>> >string. It's because your input starts with a separator.

>> Here is a case that doesn't seem to behave properly: Input ending with a
>> separator. E.g.:
>>
>> string s = ",a,";
>> vector<string> tokens;
>> split(tokens, s, is_punct(), token_compress_off);
>>
>> This results in a vector containing "" and "a", but not the final "".
>>
>> This asymmetrical behavior feels like a bug to me. Any thoughts?

>Hmm, your reasoning seems logical. The behaviour should not be asymmetric.
>Now the question is which way to go: whether it is better to include the
>trailing part, or to remove the leading one.
>
>I think that including the trailing part is better. I will see how to fix it.

Sounds good. I also think it is better to include the trailing part.
That's how I would expect it to behave. It is also easier to trim the
string before splitting, if empty tokens are not wanted, than to
manually check for separators at each end of the string and insert the
blank tokens yourself.
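
To make the expected symmetric result concrete, here is a minimal sketch
using only the standard library. It is not the library's implementation
and says nothing about how the fix will be done; split_keep_empty is a
made-up name used purely for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): split `input` at every character
// for which `is_sep` returns true and keep every token, including empty
// ones at the start and at the end of the input.
template <typename Pred>
std::vector<std::string> split_keep_empty(const std::string& input, Pred is_sep)
{
    std::vector<std::string> tokens;
    std::string current;
    for (char c : input) {
        if (is_sep(c)) {
            tokens.push_back(current);  // may be an empty token
            current.clear();
        } else {
            current += c;
        }
    }
    // The part after the last separator is always emitted, even if empty.
    tokens.push_back(current);
    return tokens;
}
```

With ",a," and a predicate that matches ',', this yields three tokens:
"", "a" and "". A caller who does not want the outer empty tokens can
trim separators from both ends of the input first.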

However, I wonder if it would be easy or logical to add another
token_compress variation. I often find myself in the same situation as
the original poster, where I simply don't care about empty tokens. For
example, reading a line of text from a file and extracting words by
splitting on whitespace. If the line started with a tab, that would give
an empty word if I didn't trim it first. Trimming is easy enough, but
it's extra overhead, and not quite as convenient.
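
As a rough sketch of the behaviour I have in mind (standard library
only; split_words is a made-up name, not an existing or proposed
library function), empty tokens are simply never emitted, so no
trimming is needed beforehand:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): split `line` on whitespace and
// never emit an empty token, so leading, trailing and repeated separators
// are all ignored.
std::vector<std::string> split_words(const std::string& line)
{
    std::vector<std::string> words;
    std::string current;
    for (char c : line) {
        // Cast to unsigned char: std::isspace is undefined for negative values.
        if (std::isspace(static_cast<unsigned char>(c))) {
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current += c;
        }
    }
    if (!current.empty()) {
        words.push_back(current);  // last word when the line has no trailing space
    }
    return words;
}
```

For a line such as "\tfoo  bar " this gives just "foo" and "bar".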

The library looks great, by the way. I originally wrote my own library
with similar functionality, but this is much more complete and flexible,
and it looks like I can throw away my own library now. :)

-- 
Be seeing you.
