From: David Abrahams (dave_at_[hidden])
Date: 2004-04-06 19:17:23

1. says:

    template <
        class TokenizerFunc = char_delimiters_separator<char>,
        class Iterator = std::string::const_iterator,
        class Type = std::string
    >
    class tokenizer

    Yet char_delimiters_separator is officially deprecated. Is that
    really intentional? Wow, it appears to be using the deprecated
    class template for the default!

    Now, I wanted to tokenize an input stream, without putting it in a
    string first. It seems to be much harder than necessary:

      #include <algorithm>
      #include <map>
      #include <string>
      #include <iostream>
      #include <boost/tokenizer.hpp>
      #include <boost/lambda/lambda.hpp>
      #include <iterator>

      int main()
      {
          typedef std::map<std::string, unsigned> fmap;

          // Seems awfully complicated
          boost::tokenizer<
              boost::char_delimiters_separator<char>
            , std::istreambuf_iterator<char>
          > t(
              std::istreambuf_iterator<char>(std::cin)
             , std::istreambuf_iterator<char>()
          );

          fmap f;
          std::string s;

          using namespace boost::lambda;

          // Count each token: increment f[token] for every token in t
          std::for_each(t.begin(), t.end(), ++var(f)[_1]);

          for (fmap::iterator p = f.begin(), e = f.end(); p != e; ++p)
              std::cout << p->second << ": " << p->first << "\n";
      }

    I can think of lots of ways to simplify the interface, most of
    which center on eliminating redundant mentions of

    When I throw the following text at it:

        how much wood could a woodchuck chuck,
        if a woodchuck could chuck wood?

    I get:

        2: a
        2: chuck
        2: could
        1: how
        1: if
        1: much
        2: wood
        2: woodchuck
    as desired. But if I replace char_delimiters_separator with
    char_separator, I get:


    What's up with that??

    Even if char_separator did what it was advertised to (and it's
    not clear that it does), it wouldn't give me the simple "find the
    words" functionality of char_delimiters_separator... so I'm
    baffled by the deprecation.

2. says:

      explicit char_separator()

      The function std::isspace() is used to identify dropped
      delimiters and std::ispunct() is used to identify kept
      delimiters. In addition, empty tokens are dropped.

   which seems strange in light of the fact that there's no ctor
   taking _functions_ to be used to determine kept/dropped
   delimiters, and nowhere in the text do you indicate that functions
   are called internally.
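
To make the quoted rule concrete, here is a hand-written sketch of what that sentence describes, using only the standard library. It is not how char_separator is implemented, and split_as_documented is a hypothetical name; it simply applies std::isspace() for dropped delimiters, std::ispunct() for kept delimiters, and discards empty tokens.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost: splits text according to the rule
// the quoted documentation states for the default-constructed separator.
std::vector<std::string> split_as_documented(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string current;
    for (std::string::const_iterator i = text.begin(); i != text.end(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(*i);
        const bool dropped = std::isspace(c) != 0;  // dropped delimiter
        const bool kept = std::ispunct(c) != 0;     // kept delimiter
        if (dropped || kept)
        {
            if (!current.empty())   // empty tokens are never emitted
            {
                tokens.push_back(current);
                current.clear();
            }
            if (kept)               // a kept delimiter becomes its own token
                tokens.push_back(std::string(1, *i));
        }
        else
        {
            current += *i;
        }
    }
    if (!current.empty())
        tokens.push_back(current);
    return tokens;
}
```

Under this reading, "chuck wood?" yields the tokens "chuck", "wood" and "?", so a word count built on it would also count punctuation unless the caller filters it out.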



Dave Abrahams
Boost Consulting