From: Pavol Droba (droba_at_[hidden])
Date: 2003-10-26 09:19:56


On Mon, Oct 27, 2003 at 12:19:51AM +1100, Thorsten Ottosen wrote:

[snip]
 
> > > Consider algorithms with 'nth', eg.:
> > >
> > > ierase_nth_copy( s, "s", 1 );
> > >
> > > this does not erase the 1st copy of "s", but the second. What's the
> > > rationale for this? I don't find it intuitive.
> >
> > The index starts from zero. It is explicitly specified in the docs.
> > The rationale is that all C arrays start at 0, and in for loops it is
> > usual practice to count from 0.
>
> where in the docs? The detailed explanation says:
>
> "Search for an n-th match of search sequence in the input sequence."
>
> which means when I specify 0, I search for the 0-th occurrence of something.
> When we talk about arrays and containers it is understood by everyone that 0
> is the first index. When people talk about words/strings, it does not make
> sense anymore; there is no 0th index; there's the first occurrence and the
> second etc. My point is that these are two very different concepts, and
> practice in one cannot be used to dictate practice in the other.
>

Then you would also need a type which holds values in the range 1..n. Arrays and
containers are 0-indexed also because unsigned types denote exactly this
domain. And actually, what is the conceptual difference between "first element" and
"first match of substring"?
 
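A minimal sketch of the zero-based convention argued for above, written against plain std::string only. The name erase_nth_copy_sketch is hypothetical and this is not the library's implementation; unlike ierase_nth_copy it is case sensitive.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration, not the library's code):
// returns a copy of `input` with the match of `search` at zero-based index
// `nth` removed. Matches are counted left to right and do not overlap.
std::string erase_nth_copy_sketch(const std::string& input,
                                  const std::string& search,
                                  std::size_t nth)
{
    if (search.empty()) return input;            // nothing sensible to erase

    // Locate match number 0, then step forward `nth` more times.
    std::size_t pos = input.find(search);
    for (std::size_t i = 0; i < nth && pos != std::string::npos; ++i)
        pos = input.find(search, pos + search.size());

    if (pos == std::string::npos) return input;  // fewer than nth + 1 matches

    std::string result(input);
    result.erase(pos, search.size());
    return result;
}
```

With this sketch, index 0 removes the first "-" of "a-b-c" and index 1 removes the second, which is the behaviour being questioned in the quoted text.
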
> > > I seem to miss predicate taking functions for eg. erase() and replace().
> > > I
> > > would like to say
> > >
> > > erase_all_if( s, is_lower() );
> >
> > It is possible to add this to the interface. Currently it is possible to do it
> > like this
> >
> > namespace sa=boost::string_algo;
> > sa::replace_all( s, sa::token_finder(is_lower()), sa::empty_formater(s) );
>
> quite verbose. I think that shows that those functions might be worth
> having. On the other hand, they might
> not be used frequently enough to warrant an inclusion.
>
> > It is not a problem to add this to the 1-layer interface. The question would be,
> > what would
> > you expect from replace_all_if?
>
> I would simply expect that I use a predicate to identify a match.
>
I'm asking because replace by design works with substrings, rather than with
single elements (sure, one element is also a substring of a kind).

replace_if would primarily search for single values, so I'm asking whether
the replacement should be a string or a value?
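
To make the question concrete, here is a rough sketch of what predicate-taking variants could look like, using only the standard library (C++11 assumed). All three names are hypothetical and not part of the library under review. The erase variant is the usual erase/remove_if idiom; the replace variant picks one possible answer to the question, substituting a whole string for each matching element.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical predicate: true for lower-case characters in the "C" locale.
inline bool is_lower_char(char c)
{
    return std::islower(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical name: removes every element for which `pred` returns true.
template <typename Pred>
void erase_all_if_sketch(std::string& s, Pred pred)
{
    s.erase(std::remove_if(s.begin(), s.end(), pred), s.end());
}

// Hypothetical name: each element matched by `pred` is replaced by the whole
// `replacement` string (one of the two options discussed, chosen arbitrarily).
template <typename Pred>
std::string replace_all_if_copy_sketch(const std::string& s, Pred pred,
                                       const std::string& replacement)
{
    std::string result;
    for (char c : s) {
        if (pred(c)) result += replacement;
        else         result += c;
    }
    return result;
}
```

If the replacement were a single value instead, std::replace_if from the standard library would already cover that case.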

> > I think it has been proposed that the returned object should
> > > support implicit conversion to bool to allow tests in if-statements and
> > > I
> > > support that idea.
> [snip]
> > About the conversion to safebool, in my personal opinion, it is not
> > explicit
> > enough. If the result is an empty range, it is obvious that the operation did
> > not
> > find anything.
> > However, because iterator_range is not used only by find algorithms,
> > cast to bool can be misleading. After all, it is a kind of container.
>
> I think the right wording would be that "it supports a subset of the
> container interface". It's not
> a container AFAICT. It's a view into an existing container.
>
> what's misleading about
>
> if( find_first( s, "blabla" ) )
>
> instead of
>
> if( !find_first( s, "blabla" ).empty() )
>
> ?

I might consider this as an option.
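
As a sketch of how such an option might look, assuming C++11, where an explicit operator bool can stand in for the safe-bool idiom. range_sketch and find_first_sketch are hypothetical stand-ins for iterator_range and find_first, not the library's classes.

```cpp
#include <string>

// Hypothetical stand-in for iterator_range: a view over [first, last).
template <typename Iterator>
class range_sketch
{
public:
    range_sketch(Iterator first, Iterator last) : first_(first), last_(last) {}

    Iterator begin() const { return first_; }
    Iterator end() const { return last_; }
    bool empty() const { return first_ == last_; }

    // Usable in if-statements, but not implicitly convertible to int.
    explicit operator bool() const { return !empty(); }

private:
    Iterator first_;
    Iterator last_;
};

typedef range_sketch<std::string::const_iterator> string_range_sketch;

// Hypothetical finder: returns the first match, or an empty range at the end.
// `input` must outlive the returned range.
inline string_range_sketch find_first_sketch(const std::string& input,
                                             const std::string& search)
{
    const std::string::size_type pos = input.find(search);
    if (pos == std::string::npos)
        return string_range_sketch(input.end(), input.end());

    const std::string::const_iterator first =
        input.begin() + static_cast<std::string::difference_type>(pos);
    const std::string::const_iterator last =
        first + static_cast<std::string::difference_type>(search.size());
    return string_range_sketch(first, last);
}
```

Both spellings from the quoted text then work side by side: `if( find_first_sketch( s, "blabla" ) )` and `if( !find_first_sketch( s, "blabla" ).empty() )`.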

[snip]

> > The problem is when your string is not long enough. So you would need to check
> > all boundary conditions. _head does it for you. Because this operation
> > is quite common, it is included in the library.
>
> ok, I did not see that. Maybe that could be explained in the docs?

The docs state explicitly what a head is:
"
Get the head of the input. Head is a prefix of a sequence of given size. If the sequence
is shorter than required, the whole sequence is considered to be the head.
"

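The boundary handling described in that definition can be sketched for std::string as follows. head_copy_sketch is a hypothetical name and not the library's function; it only illustrates the "shorter than required" rule.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the first `n` characters of `input`, or the
// whole string when it has fewer than `n` characters.
inline std::string head_copy_sketch(const std::string& input, std::size_t n)
{
    // substr clamps the count to the available length, so no explicit
    // boundary check is needed when the start position is 0.
    return input.substr(0, n);
}
```
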
>
> > > The iterator_range class. Maybe it should be called sub_string to put
> > > the
> > > focus more on string algorithms instead of generic algorithms. The
> > > border
> > > between string and non-string algorithms seems to be fuzzy. Implicit
> > > conversion to basic_string<T> ??
> >
> > The whole library is designed in such a way that it is not restricted to
> > std::basic_string.
> > Therefore such a convention would break the rule.
> >
> chances are that basic_string will be used the most. Anyway, no matter what
> string you use, it would be nice if one did not have to convert explicitly,
> eg:
>
> string substring = find_XX( s, YY );
> cout << find_XX( s, YY );
>
> instead of
>
> iterator_range< string::iterator> temp = find_XX( s, YY );
> string substring( temp.begin(), temp.end() );
> cout << string( temp.begin(), temp.end() );
>

what about

string substring=copy_range<string>( find_XX( s, YY ) );
cout << copy_range<string>( find_XX( s, YY ) );

> I find that too complicated to do really simple and frequent stuff. That's
> why I would like to
> see substring concept.

What I might consider is a kind of automatic conversion to an arbitrary container.
Something like

string str=range.copy();

"Copy" would return a stub object which would have an implicit conversion to an arbitrary type.
( using enable_if, it can be restricted to containers only )
 
For streams, I can provide << >> operators. This seems reasonable.
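
A rough sketch of the stub idea, to show that the conversion can be driven by the type on the left-hand side. copy_stub_sketch and copy_sketch are hypothetical names; the conversion is left unconstrained here, whereas a real version would presumably restrict it with enable_if as mentioned above.

```cpp
#include <string>   // used by the example conversions in the comment below
#include <vector>

// Hypothetical stub object: remembers a range and converts on demand.
template <typename Iterator>
class copy_stub_sketch
{
public:
    copy_stub_sketch(Iterator first, Iterator last)
        : first_(first), last_(last) {}

    // Converts to any type constructible from an iterator pair.
    // Unconstrained in this sketch (an assumption made for brevity).
    template <typename Container>
    operator Container() const { return Container(first_, last_); }

private:
    Iterator first_;
    Iterator last_;
};

// Hypothetical free function playing the role of range.copy().
template <typename Iterator>
copy_stub_sketch<Iterator> copy_sketch(Iterator first, Iterator last)
{
    return copy_stub_sketch<Iterator>(first, last);
}

// Example conversions, selected by the declared type:
//   std::string       a = copy_sketch(s.begin(), s.end());
//   std::vector<char> b = copy_sketch(s.begin(), s.end());
```

The sketch relies on copy-initialization (the `=` form) as in the example above; other forms of initialization may need the constraint to stay unambiguous.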

[snip]

> >
> > As far as I know, this only applies to the trim functions. It is possible to
> > rename them,
> > if it brings some benefits (like clarity).
>
> clarity is one thing; STL naming compliance must be the other.

Point taken.

>
> > > Regarding the iterator_range class, then I think it might make the
> > > documentation a little harder to grasp. Something as general as this
> > > could
> > > definitely be useful on its own (although I would prefer a shorter name,
> > > like subrange). The basic usage for iterator_range in the string library
> > > is
> > > as a sub-string, and therefore it might be more useful with a substring
> > > class
> > > which provided implicit conversion to basic_string. And the library
> > > could
> > > then provide two typedefs
> >
> > Again, I strongly disagree with adding some unnecessary affinity of the
> > lib to the
> > basic_string. It goes fundamentally against the basic design principles of
> > this library.
> >
> > The library's definition of a string is not "basic_string" but rather "an
> > arbitrary sequence
> > of characters", and it is designed on this principle.
> >
> I know, but it's also a principle that it should be easy and intuitive to use.
> AFAICT, a substring
> should be almost an iterator_range, but with
>
> 1) implicit conversion to safe-bool
> 2) implicit conversion to basic_string<Ch>
>
> this would make it somewhat more convenient to use. As it is now, the
> library might even work with
> containers of arbitrary types which makes it much more than a string
> library; true, it is more general, but IMO
> less useful.

See above.

> > > Locales. Is there any reason why a locale needs to be a parameter to each
> > > function? If one wants to use a specific locale in one function call,
> > > it
> > > seems likely that the same locale would be used throughout the entire
> > > program.
> > > I'm just wondering if it wouldn't be sufficient with some kind of
> > > set_locale
> > > function which then affected all the algorithms?
> >
> > If your program uses just one locale at a time, you are perfectly ok
> > with
> > default values and you can use std::locale::global to set your preferred
> > locale
> > settings.
> >
> > However, there are applications where there is a need to use several locales
> > simultaneously.
> >
> do you mean in two different threads? Otherwise I don't get it.

Imagine an MDI text editor which allows you to edit text in different languages.
Obviously, every open document needs its own locale.
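
A minimal sketch of why a per-call locale parameter helps in that situation, using only standard facilities (C++11 assumed). to_upper_copy_sketch is a hypothetical name, not the library's function; the default argument mirrors the "default values" mentioned above by taking a copy of the current global locale.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: upper-cases a copy of `input` using `loc`.
// When no locale is passed, a copy of the current global locale is used.
inline std::string to_upper_copy_sketch(const std::string& input,
                                        const std::locale& loc = std::locale())
{
    std::string result(input);
    for (char& c : result)
        c = std::toupper(c, loc);   // locale-aware overload from <locale>
    return result;
}
```

Each open document could then keep its own std::locale object and pass it to every call, without touching std::locale::global and therefore without affecting the other documents.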

Regards,

Pavol


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk