From: Pavol Droba (droba_at_[hidden])
Date: 2003-06-20 12:33:08


Hi,

I have no comment on the tokenizer library, but if you are interested
in this kind of functionality, you can have a look in the sandbox.

The string_algo library already contains this functionality
(along with other interesting features), and it is implemented in a more generic way.

The documentation is not updated yet (please be patient), but you can
look at the tests for examples of how the framework works.

Check $sandbox$/boost/string_algo/split.hpp (and related headers)
for the algorithms,
and $sandbox$/boost/string_algo/classification.hpp for the supporting functors.

$sandbox$/libs/string_algo/test/iterator_test.hpp contains some usage examples.

Regards,
Pavol.

PS:

Just a small comment. The library (especially the split/tokenize part)
depends on Boost.iterator_adaptors. Unfortunately, the library has not yet
been ported to the new version of it, so you cannot build the tests directly
from the sandbox.
If you want to use the library, copy all the headers from the string_algo subdir
(and/or the cumulative headers string_algo.hpp and string_algo_regex.hpp)
into your boost tree. Everything should work fine then.

On Fri, Jun 20, 2003 at 03:45:28PM +0400, Vladimir Prus wrote:
>
> I have a few comments regarding the tokenizer library.
>
> 1. The documentation says that char_delimiters_separator is the default
> parameter to the 'tokenizer' template, and at the same time says that
> 'char_delimiters_separator' is deprecated. I think that's confusing, and the
> default parameter should be changed to 'char_separator'.
>
> 2. The token iterator description is very brief. Specifically, it does not
> say what that iterator is useful for, or when it's preferable to direct use
> of tokenizer. The only way to construct the iterator is via the
> make_token_iterator function, which takes two iterators as arguments. The
> meaning of those arguments is not documented.
>
> Lastly, the usage example
>
> typedef token_iterator_generator<offset_separator>::type Iter;
> Iter beg = make_token_iterator<string>(s.begin(),s.end(),f);
> Iter end = make_token_iterator<string>(s.end(),s.end(),f);
> for(;beg!=end;++beg){
>
> appears to be just longer than tokenizer use:
>
> typedef tokenizer< offset_separator > tok_t;
> tok_t tok(s, f);
> for(tok_t::iterator i = tok.begin(); i != tok.end(); ++i) {
>
> so I *really* wonder what this iterator is for. OTOH, if it could be used
> like:
>
> for(token_iterator< offset_separator > i(s, f), e; i != e; ++i) {
> }
>
> it would be definitely simpler and easier. Is something like this possible?
>
> 3. The 'escaped_list_separator' template could have default argument for the
> first parameter, "Char".
>
> 4. I almost always try to use tokenizer when values are separated by commas.
> Believe it or not, I'm always confused as to which tokenizer function to use.
> This time, I read all the docs for char_separator and only then used
> escaped_list_separator, which does the work out of the box. Maybe a different
> name, like "csv_with_escapes_separator" or "extended_csv_separator", would help?
> It would make it immediately clear what this separator is for.
>
> - Volodya

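On point 2 of the quoted message: a loop of the form
"for(token_iterator<...> i(s, f), e; i != e; ++i)" is possible in principle
if a default-constructed iterator serves as the end sentinel, the way
std::istream_iterator does. The block below is only a minimal sketch of that
idea using the standard library. The names simple_token_iterator and
collect_tokens are hypothetical and are not part of Boost.Tokenizer or
string_algo, and the sketch splits on a set of delimiter characters rather
than using offset_separator.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical illustration only; not the Boost.Tokenizer API.
// Iterates over the tokens of a string, splitting on any character in
// `delims` and skipping empty tokens. A default-constructed instance is
// the end sentinel, in the style of std::istream_iterator.
class simple_token_iterator {
public:
    // End sentinel.
    simple_token_iterator() : src_(nullptr), pos_(0), at_end_(true) {}

    // The source string must outlive the iterator; only a pointer is kept.
    simple_token_iterator(const std::string& s, const std::string& delims)
        : src_(&s), delims_(delims), pos_(0), at_end_(false) {
        advance();  // position on the first token, or become the sentinel
    }

    const std::string& operator*() const {
        assert(!at_end_);
        return token_;
    }

    simple_token_iterator& operator++() {
        assert(!at_end_);
        advance();
        return *this;
    }

    bool operator==(const simple_token_iterator& rhs) const {
        // All exhausted iterators compare equal to the sentinel.
        if (at_end_ || rhs.at_end_) return at_end_ == rhs.at_end_;
        return src_ == rhs.src_ && pos_ == rhs.pos_;
    }

    bool operator!=(const simple_token_iterator& rhs) const {
        return !(*this == rhs);
    }

private:
    // Skip leading delimiters, then extract the next token if there is one.
    void advance() {
        const std::string::size_type start =
            src_->find_first_not_of(delims_, pos_);
        if (start == std::string::npos) {
            at_end_ = true;
            token_.clear();
            return;
        }
        std::string::size_type stop = src_->find_first_of(delims_, start);
        if (stop == std::string::npos) stop = src_->size();
        token_.assign(*src_, start, stop - start);
        pos_ = stop;
    }

    const std::string* src_;
    std::string delims_;
    std::string::size_type pos_;
    std::string token_;
    bool at_end_;
};

// Usage in the loop form proposed in the quoted message.
inline std::vector<std::string> collect_tokens(const std::string& s,
                                               const std::string& delims) {
    std::vector<std::string> out;
    for (simple_token_iterator i(s, delims), e; i != e; ++i) {
        out.push_back(*i);
    }
    return out;
}
```

With this sketch, collect_tokens("a,b;;c", ",;") yields the three tokens
"a", "b" and "c". Whether the real token iterator can offer the same
construction depends on how it stores the separator functor, which the
sketch does not model.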
