Subject: [boost] [Tokenizer]Usage and documentation
From: Max (more4less_at_[hidden])
Date: 2011-02-08 08:13:21


Hello,

I'm using boost::tokenizer to do some simple parsing of a data file whose
format is specified by the following rules:

 

- One record, consisting of several fields, per line

- Adjacent fields in a record are separated by whitespace characters
(spaces or tabs), with or without a ","

- A string without spaces may appear with or without quotation marks

- A string containing spaces must be enclosed in quotation marks

 

One example of a 5-field-per-record file:

"string 2" 3 4 5 4.3
"String", 2, 3.04 4 3
AnyOtherText, 2, 3.04 4 3
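
To make the expected result concrete, the rules above amount to roughly the hand-written splitter below. It is only a sketch without Boost: split_record is a name made up for this post, and it assumes that quotation marks never need to be escaped inside a field.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits one record into its fields.
// Spaces, tabs and commas separate fields unless they appear inside
// double quotes; the quotes themselves are not part of the field.
std::vector<std::string> split_record(const std::string& line)
{
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;   // inside "..." separators are ordinary characters
    bool have_field = false;  // true once the current field has started

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (c == '"') {
            in_quotes = !in_quotes;
            have_field = true;  // so that "" still counts as an (empty) field
        } else if (!in_quotes && (c == ' ' || c == '\t' || c == ',')) {
            // A run of separators ends at most one field.
            if (have_field) {
                fields.push_back(current);
                current.clear();
                have_field = false;
            }
        } else {
            current += c;
            have_field = true;
        }
    }
    if (have_field) {
        fields.push_back(current);  // last field on the line
    }
    return fields;
}
```

For the first example line, this is meant to yield the fields string 2, 3, 4, 5 and 4.3, which is what I would like to get out of the tokenizer as well.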

 

 

I first use the following code to split the input into lines, assuming
'input' holds the contents of the data file:

typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
boost::char_separator<char> sep("\n", " ");
tokenizer tokens(input, sep);
for (tokenizer::iterator beg = tokens.begin(); beg != tokens.end(); ++beg)
{
    // *beg is then parsed as a single line
}

 

Then, for each *beg, I parse the line like this:

typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
tokenizer tokens(*beg, boost::char_separator<char>(", "));
tokenizer::iterator it = tokens.begin();

 

But I cannot get the expected output. In the meantime, I have found the
boost::tokenizer documentation rather thin, and it is hard to find the
information I need in it.

Does anybody else feel the same way? Or is it that nobody actually uses it,
and people turn to some better library instead?

 

Thanks for any help.

 

B/Rgds

Max

 

 

 

