From: Pavol Droba (droba_at_[hidden])
Date: 2002-11-26 05:16:04


On Mon, Nov 25, 2002 at 11:57:25AM -0500, Alexei Novakov wrote:
>
> "Pavol Droba" <droba_at_[hidden]> wrote in message
> news:20021122235345.J5713_at_lenin.felcer.sk...
> > Well, in the current state, the string_algo library provides a generic set
> > of string-related algorithms. There are many reasons I chose to support any
> > sequence, not just the string.
> >
> Can you list these reasons? I have browsed string algorithm library
> discussion and found only one you've mentioned:
> <quote>
> Well I have reasons to put it in the way I did. Rationale was to allow usage
> of the functions in a functional way. I consider it quite natural to write
> something like
> if ( to_lower(trim(s)) == "ok" )
> {
> // do something with the original s
> }
> There are many cases like this in which you probably don't want to modify
> the input sequence.
> </quote>

Actually, this is not the right point. The quote you have used was about the
rationale for the non-mutating variants, not about supporting any sequence.

> This is exactly why sub_string was introduced. If sub_string is returned
> master string will not be modified and new string will not be allocated.
>
> Are there other reasons?
>

All you are pointing out is that sub_string can be used very well with the
trim functions. I agree, but this is a very special case.

So what are the reasons to allow any sequence type not just a string?

(i) Allow the usage of any string variant, like string, wstring, vector<char>, etc.
         This also means allowing any user-defined container extension to be used
         ( like your sub_string, const_string ), and also, for instance, a rope container.

         With the new container_algo library, which is under development, it would be
         possible to use even a c-string as a parameter.

(ii) Allow different containers to be used in one algorithm. For example, it is
        possible to do something like this:

        wstring str1( L"abc" );
        string str2( "111abc2abc111" );

        replace_all( str2, str1, string( "xxx " ) );

(iii) I haven't seen the need to specialize for string. All the algorithms work fine
        for any sequence, so why restrict them when it is not needed? I don't
        see any benefit from this specialization.

> >
> > I understand quite well that in most cases the lib will be used with
> > variants of the string; however, I don't like to sacrifice functionality
> > in favor of a general pattern when it is not needed.
> >
>
> It is good to have generic algorithms, but there is always a danger of becoming
> too generic. There is a pretty comprehensive set of generic algorithms to
> work with generic containers in the Standard. It would be good to have
> special algorithms for special cases, like string algorithms for string
> manipulation. But here you are trying to provide special algorithms for
> general cases, like string algorithms for generic containers.
>

Well, maybe if you can show me a specific algorithm which cannot work with
any sequence but string, I can consider it. I don't know about any.

When you use an algorithm, it does not restrict the underlying sequence. So
if you use an algorithm on a string, it will not take away the string functionality.

> > As you can see, the library is not just about the trim function, for which
> > the usage of substring is very obvious. For most of the others in the
> > current implementation, the usage of substring is questionable.
> >
> Sure, utility classes should be used only where applicable.
>
> > There are places, however, where we benefit from each other. For example,
> > there was a proposal for functions like
> > substr_until( const string& str, const string& substr ), which should give
> > you the string from the beginning until the first occurrence of substr.
> > This looks like a perfect constructor for your sub_string. Also, find
> > algorithms can be used for sub_string construction.
> >
> > When I think about joining, I have one idea. In the same way you have
> > sub_string, it may be possible to define a generic sub_sequence class
> > which would adapt to a sequence container. I think this could be quite
> > handy, and I assume that with a little bit of refactoring you can make the
> > sub_string class a specialization of this generic sub_sequence. Such a
> > class could then be easily integrated with string_algo.
> >
> For a generic case like this I believe that a pair of iterators would be a
> pretty good representation of a generic sub sequence... However I need to
> think more about it. Definitely sub sequence will not have 80% of the string
> interface implemented, which will make it practically unusable in the string
> field at which string algorithms are targeted. I think this is an example
> of going too generic.
>

Two iterators are probably a good representation, but I can imagine a wrapper
which would translate all calls to erase, find, etc. for such a subsequence,
just like your sub_string does.

And again, what prevents you from adding all the basic_string operations to a
specialization of this kind of subsequence?
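
As a rough illustration of the wrapper idea, here is a minimal sketch. All
names in it (sub_sequence_sketch, its members, the demo function) are
hypothetical and not taken from sub_string or string_algo, and C++11 is
assumed. It stores the range as an offset and a length rather than as two raw
iterators, which is a deliberate simplification so that the view stays valid
after erasing through it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical sketch of a generic sub_sequence: a view over part of a
// sequence container that forwards find and erase to the underlying container.
template <typename Container>
class sub_sequence_sketch {
public:
    typedef typename Container::iterator iterator;

    sub_sequence_sketch(Container& c, std::size_t offset, std::size_t length)
        : c_(&c), off_(offset), len_(length) {}

    iterator begin() const { return std::next(c_->begin(), diff(off_)); }
    iterator end() const { return std::next(begin(), diff(len_)); }
    std::size_t size() const { return len_; }

    // Search for a pattern inside the view only; returns end() if not found.
    template <typename Pattern>
    iterator find(const Pattern& pat) const
    {
        return std::search(begin(), end(), std::begin(pat), std::end(pat));
    }

    // Erase n elements starting at position pos of the view. The call is
    // translated into an erase on the underlying container.
    void erase(std::size_t pos, std::size_t n)
    {
        if (pos > len_) pos = len_;
        if (n > len_ - pos) n = len_ - pos;
        iterator first = std::next(begin(), diff(pos));
        c_->erase(first, std::next(first, diff(n)));
        len_ -= n;
    }

private:
    typedef typename Container::difference_type difference_type;
    static difference_type diff(std::size_t n)
    {
        return static_cast<difference_type>(n);
    }

    Container* c_;      // underlying container, not owned
    std::size_t off_;   // start of the view inside the container
    std::size_t len_;   // number of elements in the view
};

inline void sub_sequence_demo()
{
    std::string s("hello, world");
    sub_sequence_sketch<std::string> sub(s, 7, 5); // views "world"

    assert(std::distance(sub.begin(), sub.find(std::string("rl"))) == 2);
    assert(sub.find(std::string("xyz")) == sub.end());

    sub.erase(0, 2); // removes "wo" from the underlying string
    assert(s == "hello, rld");
    assert(std::string(sub.begin(), sub.end()) == "rld");
}
```

A string specialization could add the rest of the basic_string interface on
top of the same members.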

>
> I have uploaded sub string to
> http://groups.yahoo.com/group/boost/files/sub_string.zip.
>
I had a look. It looks interesting; however, I couldn't compile it with VC7,
I assume because of the partial specialization you are using.
I have compiled it with gcc 3.2, just to see how the tests work, and
it worked just fine.

A few comments about it. It is just my opinion, but I think it's quite dangerous
to redefine std::basic_string. There are many different implementations of the STL
and there is no guarantee that your implementation will work with all of them.

Also, I don't think that something like this will be accepted into boost, if for
nothing else than for the incorrect namespace.

IMHO it would be possible to provide the same functionality outside of the std
namespace.

Well, these are just my opinions, and other boosters may have different ones.

Regards,

Pavol

