Subject: Re: [boost] [Tokenizer]Usage and documentation
From: Yechezkel Mett (ymett.on.boost_at_[hidden])
Date: 2011-02-10 04:40:59
On Thu, Feb 10, 2011 at 9:32 AM, Max <more4less_at_[hidden]> wrote:
[Stephan T. Lavavej <stl_at_[hidden]> wrote:]
>> > The part I could not interpret is:
>> > ^|[\s,]
>> > And
>> > $|[\s,]
>> The docs say:
>> > A '^' character shall match the start of a line.
>> > A '$' character shall match the end of a line.
> Yes, I'm aware of this. But even with this in mind, I cannot interpret
> "^|[\s,]" and "$|[\s,]".
> For the former, I know '|' means alternation, but how can it be after '^'?
> For the latter, how can "|[\s,]" be expected after the end of a line (and
> the same confusion as above)?
"^|[\s,]" means _either_ the beginning of the line _or_ a space or comma. In
other words the field starts either at the beginning of the line or
after a space or comma.
Likewise, "$|[\s,]" means _either_ the end of the line _or_ a space or
comma: the field ends either at the end of the line or before a space or comma.
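To make the two alternations concrete, here is a minimal sketch using std::regex. It is not the code posted earlier in the thread; the function name split_fields and the exact pattern are assumptions for illustration. The leading group consumes the start-of-line-or-delimiter, and the trailing alternation is written as a lookahead so the delimiter is left for the next field to start on.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): returns the non-empty fields
// of a single line, where fields are separated by whitespace or commas.
std::vector<std::string> split_fields(const std::string& line) {
    // (?:^|[\s,])   start of line OR one space/comma (consumed, not captured)
    // ([^\s,]+)     the field itself: one or more non-delimiter characters
    // (?=$|[\s,])   end of line OR a space/comma, checked but not consumed
    static const std::regex field(R"((?:^|[\s,])([^\s,]+)(?=$|[\s,]))");

    std::vector<std::string> out;
    for (std::sregex_iterator it(line.begin(), line.end(), field), end;
         it != end; ++it) {
        out.push_back((*it)[1].str());  // capture group 1 is the field text
    }
    return out;
}
```

Because the field must contain at least one character, an empty line produces no fields, and the empty spot between two adjacent commas is skipped.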
> One more question - with your code, any empty 'token' between two contiguous
> ',' is ignored, what if someday I'd like to pick them up?
I'm presuming an empty line should count as no tokens; if you don't
mind an empty line being one token it can be simplified to
Not really that much simpler.
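For picking up empty tokens, one possible approach is to split on the delimiter rather than match the fields. This is a different technique from the regexes discussed above, and it is only a sketch: the name split_keep_empty is hypothetical, and it assumes a comma is the only delimiter (treating whitespace as a delimiter too would turn ", " into an extra empty field).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits on commas and keeps
// the empty fields that appear between two adjacent commas.
std::vector<std::string> split_keep_empty(const std::string& line) {
    static const std::regex comma(",");

    std::vector<std::string> out;
    // Submatch index -1 asks the iterator for the text BETWEEN matches,
    // i.e. the fields themselves rather than the delimiters.
    std::sregex_token_iterator it(line.begin(), line.end(), comma, -1), end;
    for (; it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

With this sketch, "a,,b" yields three fields, the middle one empty.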