From: Rush Manbert (rush_at_[hidden])
Date: 2006-07-24 15:10:26
Seth Nielson wrote:
> Hi
>
> Two questions related to string tokenization.
>
> 1. Which is preferred? using "split" or the "tokenizer class"?
> 2. Both of these methods seem geared towards splitting on characters
> rather than splitting on substrings. Is there yet another method that is
> preferred for splitting a string on an exact substring? If I want to
> split "I<mark>Am<mark>A<mark>Test" into I, Am, A, Test, what is the best
> way? It seems that for split I'll have to write my own predicate, and
> for tokenizer, I'll have to write my own tokenizerFunction.
>
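For split() and simple cases where the marker is a single character, you
can use the existing predicates. For instance:
    boost::split(splitVec, submitData, boost::algorithm::is_any_of(","));
splits submitData at every comma and stores the pieces in splitVec, which
makes it very simple to tokenize a string. Note that is_any_of() matches
individual characters, so on its own it does not split on an exact
substring such as "<mark>". I have used this approach in a multi-level
parsing algorithm. I don't know how the performance stacks up against
other approaches, but it serves my purpose and I can still understand it
when I go back six months later and look at it again. :-)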
- Rush