Subject: [boost] [Tokenizer]Usage and documentation
From: Max (more4less_at_[hidden])
Date: 2011-02-08 08:13:21



I'm using boost::tokenizer to do some simple parsing of a data file whose
format is specified by the following rules:


- One record of several fields in a single line

- Adjacent data fields in a record are separated by whitespace characters
(space or tab), with or without a ","

- String without space(s), with or without quotation marks

- String with space(s), with quotation marks


One example of a 4-field-per-record file is like:


"string 2" 3 4 5 4.3

"String", 2, 3.04 4 3

AnyOtherText, 2, 3.04 4 3



I am using the following code to first split the input into lines, supposing
'input' holds the contents of the data file:


typedef boost::tokenizer<boost::char_separator<char> > tokenizer;

boost::char_separator<char> sep("\n", " ");

tokenizer tokens(input, sep);

for(tokenizer::iterator beg=tokens.begin(); beg!=tokens.end(); ++beg)




Then, for each *beg, I parse the line with this:


typedef boost::tokenizer<boost::char_separator<char> > tokenizer;

tokenizer tokens (*beg, boost::char_separator<char>(", "));

tokenizer::iterator it= tokens.begin();


But I cannot get the expected output. In the meantime, I have found the
documentation of boost::tokenizer quite thin, and it is not easy to find the
information that I need.


Does anybody else have the same feeling? Or is it that nobody actually uses
it, and people are turning to some other, better library instead?


Thanks for any help.






