
From: Corwin Joy (cjoy_at_[hidden])
Date: 2001-06-21 20:52:00


I had a question about the tokenizer class.
Let's say I tokenize a string as given in the example for the library:

const string test_string = "This,,is, a.test..";
// Use the convenience token_iterator
typedef token_iterator<string, string::const_iterator,
                       punct_space_separator<char> > TokType;
TokType begin(test_string.begin(), test_string.end()), end;
copy(begin, end, ostream_iterator<string>(cout, "|"));
cout << "\n";

Output:
This|is|a|test|

Often, in addition to the tokens, I also want to know their 'inverse',
i.e. the runs of separator characters (punctuation and whitespace) that
sit between the tokens.

Desired output (the inverse of the tokens, i.e. the token separators):
,,|, |.|..|

Is there any way to get this out of the tokenizer, or a reasonably easy
way to adapt what is already there?
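
To make the request concrete, here is a minimal hand-rolled sketch of the behaviour being asked for. It does not use the tokenizer library at all: the names Runs, is_sep and split_runs are hypothetical, and it assumes that a separator is any punctuation or whitespace character, which is only a guess at what punct_space_separator<char> treats as one.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper types; not part of the Boost tokenizer.
struct Runs {
    std::vector<std::string> tokens;      // maximal runs of non-separator characters
    std::vector<std::string> separators;  // maximal runs of separator characters
};

// Assumption: a separator is any punctuation or whitespace character.
inline bool is_sep(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    return std::ispunct(u) || std::isspace(u);
}

// Walk the string once, cutting it into alternating runs of token
// characters and separator characters, and file each run accordingly.
inline Runs split_runs(const std::string& s) {
    Runs out;
    std::string::size_type i = 0;
    while (i < s.size()) {
        const bool sep = is_sep(s[i]);
        std::string::size_type j = i;
        while (j < s.size() && is_sep(s[j]) == sep) {
            ++j;  // extend the run while the character class stays the same
        }
        (sep ? out.separators : out.tokens).push_back(s.substr(i, j - i));
        i = j;
    }
    return out;
}
```

With the test string above, split_runs(test_string).separators holds ",,", ", ", "." and "..", which gives the desired output when each entry is printed followed by "|". A version built on the tokenizer itself would presumably need the separator function to expose the skipped range, which is the part I am asking about.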

Corwin

