Boost Users :

From: Eric Hill (eric_at_[hidden])
Date: 2006-07-24 14:50:06


> Two questions related to string tokenization.
>
> 1. Which is preferred? using "split" or the "tokenizer class"?
> 2. Both of these methods seem geared towards splitting on characters
> rather than splitting on substrings. Is there yet another method that is
> preferred for splitting a string on an exact substring? If I want to
> split "I<mark>Am<mark>A<mark>Test" into I, Am, A, Test, what is the best
> way? It seems that for split I'll have to write my own predicate, and
> for tokenizer, I'll have to write my own tokenizerFunction.

Have a look at the Boost Spirit parser framework. You can define an
arbitrarily complex grammar and decorate it with semantic actions to suit
your needs. The only thing I need that it doesn't have "out of the box"
is short-circuit matching of abbreviated keywords (such that "show",
"sho", and "sh" automatically map to the same action), but that's a very
small thing compared to the wonderful flexibility of a complete recursive
descent parser in C++.

Eric
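
For the specific case in the question (splitting on an exact substring
rather than on a set of characters), a full grammar may be more than is
needed. The following is a minimal sketch that uses only std::string::find
from the standard library; it is not Spirit, and it is not Boost's split or
tokenizer. The helper name split_on_substring is hypothetical, and keeping
empty tokens is an assumption about the desired behavior.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost or the standard library).
// Splits `input` at every exact occurrence of the substring `delim`.
// Empty tokens (from adjacent or leading/trailing delimiters) are kept.
std::vector<std::string> split_on_substring(const std::string& input,
                                            const std::string& delim)
{
    std::vector<std::string> tokens;

    // An empty delimiter would never advance the search position,
    // so return the whole input as a single token instead.
    if (delim.empty()) {
        tokens.push_back(input);
        return tokens;
    }

    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(delim, start);
        if (pos == std::string::npos) {
            // No further delimiter: the remainder is the last token.
            tokens.push_back(input.substr(start));
            break;
        }
        tokens.push_back(input.substr(start, pos - start));
        start = pos + delim.size();  // resume after the delimiter
    }
    return tokens;
}
```

Calling split_on_substring("I<mark>Am<mark>A<mark>Test", "<mark>") returns
the four tokens I, Am, A, and Test. A grammar-based approach such as Spirit
becomes more attractive once the delimiters or tokens need real parsing
rules rather than a fixed string.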

