From: Daryle Walker (darylew_at_[hidden])
Date: 2000-09-13 00:19:54


on 9/9/00 3:22 AM, John R. Bandela at jbandela_at_[hidden] wrote:

> I have updated the Token Iterator and Tokenizer concepts. They are
> located in BoostUpdate3.zip, in the TokenIterator directory.
>
> In addition, I have provided an implementation of those concepts
> along with several samples. The code has been reorganized. Here are
> some highlights of the update.
>
> 1. punct_space_tokenizer moved to separate file since there are many
> ways to implement it.

It is still referenced in the main tokenizer.hpp file, since (a version of)
punct_space_tokenizer is the default for the second template argument of the
token_iterator class. I think punct_space_tokenizer should be just another
sample, not the default tokenizer. It's highly likely that some other
tokenizer will be used, so leaving punct_space_tokenizer and its header out
of tokenizer.hpp means clients who don't use it won't pay for having it.

> 2. New template ptr_tokenizer_fun. This is analogous to ptr_fun in
> the STL and is used to turn a regular function into a Tokenizer

Maybe some sort of traits class should be added. The ptr_tokenizer_fun
adapter would use that traits class, and so could any other potential
tokenizer or adapter.

> 3. Some of the sample tokenizers that don't need to maintain state
> information are now functions and use ptr_tokenizer_fun
>
> 4. New samples
> a) A simple tokenizer that filters out non-alpha characters
> b) A simple four-function (no operator precedence) calculator
> c) A replacing tokenizer
>
> Of these, I believe c) is the most interesting. You give it a string
> to replace and what to replace it with, and when it is incremented
> and dereferenced, it will return the character that would have been
> there had a search/replace been done. However, it does it without
> modifying the original sequence. It is also capable of being used with
> input iterators. To do so, because it needs to scan ahead, it has
> to maintain a buffer. However, the max size of the buffer is the size
> of the string to replace with. Check it out and tell me what you
> think.

It choked on me; see below.
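
For readers who have not opened the archive, the behavior described for
sample c) can be sketched roughly as follows. This is a simplified
illustration and not the class from BoostUpdate3.zip: the names
replacing_cursor and replaced_copy are hypothetical, and it assumes forward
iterators, so it can look ahead in place and needs none of the buffering that
the input-iterator case requires.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not the replacing tokenizer from BoostUpdate3.zip.
// Walks [first, last) and yields the characters a search/replace would
// have produced, without ever writing to the underlying sequence.
template <class FwdIter>
class replacing_cursor
{
public:
    replacing_cursor( FwdIter first, FwdIter last,
                      const std::string& from, const std::string& to )
        : cur_(first), last_(last), from_(from), to_(to),
          repl_pos_(0), repl_len_(0) {}

    // Stores the next output character in 'out'; returns false at the end.
    bool next( char& out )
    {
        for (;;)
        {
            if ( repl_pos_ < repl_len_ )   // still emitting a replacement
            {
                out = to_[repl_pos_++];
                return true;
            }
            if ( cur_ == last_ )
                return false;
            if ( !from_.empty() && matches_here() )
            {
                // Skip the matched text, then start emitting 'to_'.
                for ( std::size_t i = 0; i < from_.size(); ++i )
                    ++cur_;
                repl_pos_ = 0;
                repl_len_ = to_.size();
                continue;
            }
            out = *cur_;                   // ordinary character, pass through
            ++cur_;
            return true;
        }
    }

private:
    // True if 'from_' occurs starting at the current position.
    bool matches_here() const
    {
        FwdIter it = cur_;
        for ( std::size_t i = 0; i < from_.size(); ++i, ++it )
            if ( it == last_ || *it != from_[i] )
                return false;
        return true;
    }

    FwdIter     cur_, last_;
    std::string from_, to_;
    std::size_t repl_pos_, repl_len_;
};

// Convenience wrapper (also hypothetical): collect the whole result.
template <class FwdIter>
std::string replaced_copy( FwdIter first, FwdIter last,
                           const std::string& from, const std::string& to )
{
    replacing_cursor<FwdIter> c( first, last, from, to );
    std::string result;
    char ch;
    while ( c.next( ch ) )
        result += ch;
    return result;
}

inline void replacing_cursor_demo()
{
    const std::string text = "a cat sat";
    assert( replaced_copy( text.begin(), text.end(), "at", "ow" )
            == "a cow sow" );
    assert( text == "a cat sat" );         // the source is untouched
}
```

The sketch matches greedily from left to right and does not rescan the
replacement text, which is one reasonable reading of "had a search/replace
been done"; the actual sample may differ.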

> Thanks to everyone who has taken the time to look at the code and
> concepts and provide feedback. Please let me know what you think of
> this update including concepts, code, and samples.

I can't compile the test code; it has a problem with default arguments.
The token_iterator class template has two constructors that take a constant
reference to a tokenizer object as the final argument. That argument defaults
to a default-constructed object of the tokenizer type. Unfortunately, my
compiler [Metrowerks CodeWarrior Pro 5 (with 5.3 update) for the Mac OS]
can't handle the template when the tokenizer type is not
default-constructible. Only one of the test samples has that property, but
the compiler doesn't even allow the workaround of explicitly giving the
last argument a specific value.

In other words:

//==========================================================================
class MyFirst
{
public:
    MyFirst() {}
    //...
};

class MySecond
{
public:
    MySecond( int i ) : i_(i) {}
    //...
private:
    int i_;
};

//...

// This works
Container x;    // any sequence type whose iterator type is Iter
token_iterator<Iter, MyFirst> begin( x.begin(), x.end() ), end;

// This doesn't work: the compiler chokes on MySecond not having a default
// constructor, even though I don't try to use one!
Container y;
token_iterator<Iter, MySecond> begin2( y.begin(), y.end(), MySecond(1) ),
                                end2( MySecond(0) );
//==========================================================================
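
One possible library-side workaround, offered as a sketch and not as what
BoostUpdate3.zip does: replace each defaulted tokenizer argument with a pair
of overloads, so that no default argument mentions Tokenizer() at all. The
class name token_iterator_sketch and its members are hypothetical stand-ins;
only the constructor signatures matter. On a conforming compiler the
overloads that default-construct the tokenizer are never instantiated unless
they are called. Whether CodeWarrior Pro 5 accepts this is an untested
assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for token_iterator; iteration is omitted.
template <class Iter, class Tokenizer>
class token_iterator_sketch
{
public:
    // Overloads that default-construct the tokenizer. These bodies are
    // only instantiated if a caller actually uses them.
    token_iterator_sketch()
        : first_(), last_(), tok_() {}
    token_iterator_sketch( Iter first, Iter last )
        : first_(first), last_(last), tok_() {}

    // Overloads that copy a caller-supplied tokenizer.
    explicit token_iterator_sketch( const Tokenizer& t )
        : first_(), last_(), tok_(t) {}
    token_iterator_sketch( Iter first, Iter last, const Tokenizer& t )
        : first_(first), last_(last), tok_(t) {}

    const Tokenizer& tokenizer() const { return tok_; }

private:
    Iter      first_, last_;
    Tokenizer tok_;
};

// A tokenizer type with no default constructor, as in the example above.
class MySecond
{
public:
    explicit MySecond( int i ) : i_(i) {}
    int value() const { return i_; }
private:
    int i_;
};

inline void overload_demo()
{
    const char text[] = "abc";
    // Only the overloads taking a tokenizer are used, so MySecond never
    // needs a default constructor.
    token_iterator_sketch<const char*, MySecond>
        begin2( text, text + 3, MySecond(1) ), end2( MySecond(0) );
    assert( begin2.tokenizer().value() == 1 );
    assert( end2.tokenizer().value() == 0 );
}
```

The cost is two extra constructors per class; the benefit is that the
default-constructibility requirement applies only to callers who rely on the
default tokenizer.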
