Subject: Re: [boost] [Tokenizer]Usage and documentation
From: Yechezkel Mett (ymett.on.boost_at_[hidden])
Date: 2011-02-10 04:40:59
On Thu, Feb 10, 2011 at 9:32 AM, Max <more4less_at_[hidden]> wrote:
[Stephan T. Lavavej <stl_at_[hidden]> wrote:]
>> > The part I could not interpret is:
>> > ^|[\s,]
>> > And
>> > $|[\s,]
>> The docs say:
>> > A '^' character shall match the start of a line.
>> > A '$' character shall match the end of a line.
> Yes, I'm aware of this. But even with this in mind, I cannot interpret
> "^|[\s,]" and "$|[\s,]".
> For the former, I know '|' means alternation, but how can it be after '^'?
> For the latter, how can "|[\s,]" be expected after the end of a line (and
> the same confusion as above)?
"^|[\s,]" means _either_ the beginning of the line _or_ a space or comma. In
other words the field starts either at the beginning of the line or
after a space or comma.
Likewise, "$|[\s,]" means the field ends either at the end of the line or before a space or comma.
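To make the two alternations concrete, here is a minimal sketch using std::regex (ECMAScript grammar). It is not the expression from the earlier message in this thread; the helper name extract_fields and the exact pattern are assumptions made for illustration. The end-of-field test is written as a lookahead so that a separator is not consumed and can still serve as the start of the next field.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): collects the fields of one
// line, where fields are separated by whitespace or commas.
std::vector<std::string> extract_fields(const std::string& line) {
    // (?:^|[\s,])  the field starts at the beginning of the line
    //              or right after a space or comma
    // ([^\s,]+)    the field itself: one or more non-separator characters
    // (?=$|[\s,])  the field ends at the end of the line or just before
    //              a space or comma (lookahead, so nothing is consumed)
    static const std::regex field_re(R"((?:^|[\s,])([^\s,]+)(?=$|[\s,]))");

    std::vector<std::string> fields;
    for (std::sregex_iterator it(line.begin(), line.end(), field_re), end;
         it != end; ++it) {
        fields.push_back((*it)[1].str());  // capture group 1 is the field
    }
    return fields;
}
```

With this sketch, extract_fields("a,b  c,,d") yields the four fields a, b, c and d, and an empty line yields no fields, because the field group requires at least one character.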
> One more question - with your code, any empty 'token' between two contiguous
> ',' is ignored, what if someday I'd like to pick them up?
I'm presuming an empty line should count as no tokens; if you don't
mind an empty line being one token it can be simplified to
Not really that much simpler.
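For comparison, one possible way to keep the empty fields is to split on the separator instead of matching the fields. This is a different technique from the expression discussed above, sketched with std::sregex_token_iterator; the helper name split_keep_empty and the choice of a bare comma as the only separator are assumptions for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits a line on commas and
// keeps empty fields that sit between two adjacent commas.
std::vector<std::string> split_keep_empty(const std::string& line) {
    static const std::regex comma_re(",");

    std::vector<std::string> fields;
    // Submatch index -1 selects the text *between* matches of the
    // separator rather than the matches themselves.
    std::sregex_token_iterator it(line.begin(), line.end(), comma_re, -1), end;
    for (; it != end; ++it) {
        fields.push_back(it->str());
    }
    return fields;
}
```

With this sketch, split_keep_empty("a,,b") yields three fields, the middle one empty. As far as I know, the token iterator does not report an empty field after a trailing separator, so a line ending in a comma would need separate handling.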