Boost Users :

From: Thore Karlsen (sid_at_[hidden])
Date: 2005-06-12 15:45:11


On Sat, 11 Jun 2005 12:52:56 -0500, "Tom Browder" <tbrowder_at_[hidden]>
wrote:

>I have used my own C++ tokenizer in the past, but I would like to use
>Boost's instead.
>
>The predominant use of tokenizing for me is to split on white space, but
>Boost's default is to use white space AND punctuation. Is there any
>possibility to have either the default changed, or another TokenizerFunction
>added such as ws_separator, or something similar?
>
>I know I can use
>
> boost::char_separator<char> sep(" \n\t");
>
>(but do I need to add "\v" to the char set?)
>
>but I would rather have something like
>
> boost::ws_separator sep;
>
>and, better, make the ws_separator be the default TokenizerFunction for
>tokenizer.

Have you looked at the string_algo library? I much prefer its split
functionality to the tokenizer library, and what you want here is very
easy to accomplish with it.

Example:

  // needs <string>, <vector> and <boost/algorithm/string.hpp>
  std::vector<std::string> v;
  boost::split(v, "split me into tokens", boost::is_space(),
               boost::token_compress_on);

You should really check this library out. It's got a ton of useful
stuff.

-- 
Be seeing you.
