From: jeff_at_[hidden]
Date: 2001-06-01 10:35:02
I have reviewed Tokenizer and believe it should be accepted into Boost. I used
g++ 2.95.2 under Cygwin for testing. I have some minor comments. I apologize
if these overlap with other reviewers' comments, but I have not had time
recently to follow the Boost list closely.
Code comments / questions:
1) I would have expected the Tokenizer class to take TokenizerFunc as
the first template parameter, with the input types (e.g. string and
string::const_iterator) second and third, allowing for defaults:
  template <class TokenizerFunc,
            class Token=std::string,
            class Iterator=std::string::const_iterator>
  class Tokenizer {...}
Then the third example could be simplified from:
  typedef tokenizer<string,string::const_iterator,punct_space_separator<char> > Tok;
to
  typedef tokenizer<punct_space_separator<char> > Tok;
In my mind this is much clearer.
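
To make the suggestion concrete, here is a minimal, self-contained sketch of the proposed parameter order. It does not use the library: tokenizer_sketch and punct_space_sketch are hypothetical stand-ins for tokenizer and punct_space_separator<char>, and the eager tokens() member is an assumption made only to keep the example short.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for punct_space_separator<char>: splits on
// whitespace and punctuation. This is not the library's implementation.
struct punct_space_sketch {
    template <class Iterator, class Token>
    bool operator()(Iterator& next, Iterator end, Token& tok) const {
        while (next != end && is_sep(*next)) ++next;   // skip separators
        if (next == end) return false;                 // no more tokens
        Iterator first = next;
        while (next != end && !is_sep(*next)) ++next;  // consume one token
        tok.assign(first, next);
        return true;
    }
    void reset() {}  // stateless, so nothing to reset

private:
    static bool is_sep(char c) {
        unsigned char u = static_cast<unsigned char>(c);
        return std::isspace(u) != 0 || std::ispunct(u) != 0;
    }
};

// Proposed order: the functor comes first, the input types are defaulted.
template <class TokenizerFunc,
          class Token = std::string,
          class Iterator = std::string::const_iterator>
class tokenizer_sketch {
public:
    tokenizer_sketch(Iterator first, Iterator last,
                     const TokenizerFunc& f = TokenizerFunc())
        : first_(first), last_(last), f_(f) {}

    // Collects every token eagerly (an assumption for brevity; the
    // reviewed class exposes iterators instead).
    std::vector<Token> tokens() const {
        std::vector<Token> out;
        TokenizerFunc f = f_;
        f.reset();
        Iterator it = first_;
        Token tok;
        while (f(it, last_, tok)) out.push_back(tok);
        return out;
    }

private:
    Iterator first_, last_;
    TokenizerFunc f_;
};

// With the defaults in place, the typedef names only the functor.
typedef tokenizer_sketch<punct_space_sketch> Tok;

std::vector<std::string> split_words(const std::string& s) {
    Tok tok(s.begin(), s.end());
    return tok.tokens();
}

// Small usage check.
void example() {
    std::vector<std::string> words = split_words("Hello, world!");
    assert(words.size() == 2);
    assert(words[0] == "Hello");
    assert(words[1] == "world");
}
```

A user who needs a different token or iterator type can still spell out all three parameters; only the common std::string case gets shorter.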
2) Is it possible to use a raw C string instead of std::string as the input? I
experimented a bit, but I could not get this to work. I have a use for this in
an application that needs to tokenize data from a socket connection. The data
is returned as a C string, and I would like to avoid the overhead of an extra
string construction / string copy.
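
As an illustration of the no-copy behaviour asked about here, the following sketch tokenizes a character buffer in place and returns pointer pairs into it. It is independent of the library: split_in_place and span are hypothetical names, and whether the reviewed tokenizer can be instantiated with const char* iterators is exactly the open question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// A token is a [first, last) pair of pointers into the caller's buffer,
// so no characters are copied.
typedef std::pair<const char*, const char*> span;

// Hypothetical helper: splits buf[0, len) on any character in delims.
// Note: an embedded '\0' in the buffer is also treated as a delimiter,
// because strchr matches the terminator of delims.
std::vector<span> split_in_place(const char* buf, std::size_t len,
                                 const char* delims) {
    std::vector<span> out;
    const char* p = buf;
    const char* end = buf + len;
    while (p != end) {
        while (p != end && std::strchr(delims, *p) != 0) ++p;  // skip delimiters
        if (p == end) break;
        const char* first = p;
        while (p != end && std::strchr(delims, *p) == 0) ++p;  // scan one token
        out.push_back(span(first, p));
    }
    return out;
}

// Small usage check with data as it might arrive from a socket read.
void example() {
    const char* buf = "GET /index.html HTTP/1.0";
    std::vector<span> t = split_in_place(buf, std::strlen(buf), " ");
    assert(t.size() == 3);
    assert(t[0].first == buf);  // points into the original buffer
    assert(std::string(t[1].first, t[1].second) == "/index.html");
}
```

A std::string is built in the check only to compare contents; the tokens themselves remain views into the original buffer.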
3) Dependencies
This is just a point of information; I don't expect the library to change. I
used an earlier version of tokenizer, and it required only the tokenizer headers,
utility.hpp, and config.hpp. The new version requires over 20 Boost headers,
including some from the detail and type_traits directories. This is apparently
the cost of converting to iterator_adaptors.
Documentation Comments:
1) mainpage
a) Add a sentence in the summary which describes and directly links the
csv_separator, offset_separator, and punct_space_separator examples. As a user
of the library, these concrete example pages are the first thing I want to read,
and they are currently buried at the very end of the TokenizerFunction page.
b) A link to the iterator adaptor page in the second sentence would be nice.
c) Put the "convenience iterators" first in the examples.
d) A short description of the typical steps involved in usage would help explain
the library. Something like:
Typical steps in creating a custom tokenizer are to write a
TokenizerFunc(tor) which provides an operator() and a reset function.
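
A short sketch of what such a functor might look like could accompany that description. The one below is an assumption-laden illustration rather than library code: fixed_width_func and collect are hypothetical names, and the operator()(next, end, tok) signature returning bool is assumed to match the TokenizerFunction page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical functor that cuts the input into fields of fixed widths.
// It keeps state (the index of the current field), which is why reset()
// is needed before each new pass over an input.
class fixed_width_func {
public:
    explicit fixed_width_func(const std::vector<std::size_t>& widths)
        : widths_(widths), current_(0) {}

    // Return to the first field.
    void reset() { current_ = 0; }

    // Extract one token starting at next; advance next past it.
    // Returns false when the input or the width list is exhausted.
    template <class Iterator, class Token>
    bool operator()(Iterator& next, Iterator end, Token& tok) {
        if (next == end || current_ == widths_.size()) return false;
        tok = Token();
        for (std::size_t i = 0; i < widths_[current_] && next != end; ++i, ++next)
            tok += *next;
        ++current_;
        return true;
    }

private:
    std::vector<std::size_t> widths_;
    std::size_t current_;
};

// Hypothetical driver playing the role of the tokenizer class: it calls
// reset() once, then operator() until it returns false.
std::vector<std::string> collect(fixed_width_func f, const std::string& s) {
    std::vector<std::string> out;
    f.reset();
    std::string::const_iterator it = s.begin();
    std::string tok;
    while (f(it, s.end(), tok)) out.push_back(tok);
    return out;
}

// Small usage check: split a date string into year, month, day.
void example() {
    const std::size_t w[] = {4, 2, 2};
    fixed_width_func f(std::vector<std::size_t>(w, w + 3));
    std::vector<std::string> parts = collect(f, "20010601");
    assert(parts.size() == 3);
    assert(parts[0] == "2001");
    assert(parts[1] == "06");
    assert(parts[2] == "01");
}
```

The two steps from the description map directly onto the two member functions: reset() restores the starting state, and operator() produces one token per call.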
2) The tokenizer policy documentation page
a) The 3rd sentence "The punct_space_separator function object..." seems out
of place.