Boost :
From: Corwin Joy (cjoy_at_[hidden])
Date: 2001-06-21 20:52:00
I had a question about the tokenizer class.
Let's say I tokenize a string as given in the example for the library:
const string test_string = "This,,is, a.test..";
// Use the convenience token_iterator
typedef token_iterator<string, string::const_iterator,
                       punct_space_separator<char> > TokType;
TokType begin(test_string.begin(),test_string.end()), end;
copy(begin, end,ostream_iterator<string>(cout,"|"));
cout << "\n";
Output:
This|is|a|test|
Often, in addition to the tokens, I also want the 'inverse' of the tokens,
i.e. the runs of separator characters (punctuation and whitespace) that sat
between them.
Desired output (the inverse of the tokens, i.e. the token separators):
,,|, |.|..|
Is there any way to get this out of the tokenizer, and/or a reasonably easy
way to adapt what is already in there?
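To make the request concrete, here is a rough sketch of the behaviour I am after, written against the standard library only and not against the tokenizer itself. The names is_separator, SplitResult and split_with_separators are made up for this illustration, and I am assuming that "separator" means any punctuation or whitespace character, which may not match exactly what punct_space_separator does.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of the tokenizer library): assumes a
// separator is any punctuation or whitespace character.
bool is_separator(char c) {
    unsigned char uc = static_cast<unsigned char>(c);
    return std::ispunct(uc) != 0 || std::isspace(uc) != 0;
}

// Hypothetical result type: the tokens plus the separator runs between them.
struct SplitResult {
    std::vector<std::string> tokens;
    std::vector<std::string> separators;
};

// Walk the string once, cutting it into maximal runs of separator
// characters and maximal runs of non-separator characters.
SplitResult split_with_separators(const std::string& s) {
    SplitResult result;
    std::string::size_type i = 0;
    while (i < s.size()) {
        const bool in_separator = is_separator(s[i]);
        std::string::size_type j = i;
        // Extend the run while the characters stay in the same class.
        while (j < s.size() && is_separator(s[j]) == in_separator) {
            ++j;
        }
        const std::string run = s.substr(i, j - i);
        if (in_separator) {
            result.separators.push_back(run);
        } else {
            result.tokens.push_back(run);
        }
        i = j;
    }
    return result;
}
```

Calling split_with_separators on the test string above should put This, is, a, test into tokens and the four runs ",,", ", ", "." and ".." into separators, which is the inverse I described.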
Corwin