From: David Abrahams (dave_at_[hidden])
Date: 2004-04-06 19:17:23
1. http://www.boost.org/libs/tokenizer/tokenizer.htm says:
    class TokenizerFunc = char_delimiters_separator<char>,
    class Iterator = std::string::const_iterator,
    class Type = std::string
Yet char_delimiters_separator is officially deprecated. Is that
really intentional? Wow, it appears to be using the deprecated
class template for the default!
Now, I wanted to tokenize an input stream without putting it in a
string first. It seems to be much harder than necessary:
    typedef std::map<std::string, unsigned> fmap;
    // Seems awfully complicated
    using namespace boost::lambda;
    std::for_each(t.begin(), t.end(), ++var(f)[_1]);
    for (fmap::iterator p = f.begin(), e = f.end(); p != e; ++p)
        std::cout << p->second << ": " << p->first << "\n";
I can think of lots of ways to simplify the interface, most of
which center on eliminating redundant mentions of
When I throw the following text at it:
how much wood could a woodchuck chuck,
if a woodchuck could chuck wood?
as desired. But if I replace char_delimiters_separator with
char_separator, I get:
What's up with that??
Even if char_separator did what it is advertised to do (and it's
not clear that it does), it wouldn't give me the simple "find the
words" functionality of char_delimiters_separator... so I'm
baffled by the deprecation.
2. http://www.boost.org/libs/tokenizer/char_separator.htm says:
    The function std::isspace() is used to identify dropped
    delimiters and std::ispunct() is used to identify kept
    delimiters. In addition, empty tokens are dropped.
which seems strange in light of the fact that there's no ctor
taking _functions_ to be used to determine kept/dropped
delimiters, and nowhere in the text do you indicate that functions
are called internally.
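To make the quoted sentence concrete, here is a minimal sketch of the
behaviour it seems to describe, written against the standard library
only. It is not Boost.Tokenizer's implementation: the name
split_default and its signature are hypothetical, and it assumes that
"kept" means a punctuation character is emitted as a one-character
token of its own while whitespace is simply discarded.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost.Tokenizer. Splits [first, last)
// the way the quoted paragraph reads: std::isspace() marks dropped
// delimiters, std::ispunct() marks kept delimiters, empty tokens are dropped.
template <class InputIterator>
std::vector<std::string> split_default(InputIterator first, InputIterator last)
{
    std::vector<std::string> tokens;
    std::string current;
    for (; first != last; ++first) {
        // Go through unsigned char: passing a negative value to the
        // <cctype> functions is undefined behaviour.
        const unsigned char c = static_cast<unsigned char>(*first);
        if (std::isspace(c) || std::ispunct(c)) {
            // Any delimiter ends the token in progress; an empty one is dropped.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
            // Assumption: a kept delimiter becomes its own one-character token.
            if (std::ispunct(c))
                tokens.push_back(std::string(1, static_cast<char>(c)));
        } else {
            current += static_cast<char>(c);
        }
    }
    // Flush the last token if the input did not end on a delimiter.
    if (!current.empty())
        tokens.push_back(current);
    return tokens;
}
```

Because it only needs input iterators, the same function can be fed a
pair of std::istreambuf_iterator<char> to read straight from a stream.
Under the assumption above, the second line of the woodchuck text
yields its six words followed by a separate "?" token.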
-- Dave Abrahams Boost Consulting www.boost-consulting.com