Boost :

From: Thorsten Ottosen (nesotto_at_[hidden])
Date: 2003-10-26 08:19:51

Next message: Thorsten Ottosen: "[boost] Re: [string algo formal review]"
Previous message: Paul A. Bristow: "RE: [boost] Yet Another Units Library"
In reply to: Pavol Droba: "Re: [boost] [string algo formal review]"
Next in thread: Pavol Droba: "Re: [boost] Re: [string algo formal review]"
Reply: Pavol Droba: "Re: [boost] Re: [string algo formal review]"

"Pavol Droba" <droba_at_[hidden]> wrote in message
news:20031026095828.GB17011_at_lenin.felcer.sk...
> On Sun, Oct 26, 2003 at 04:54:34PM +1100, Thorsten Ottosen wrote:
[snip]
> > I get a crash when I try to call to_lower( char* ); The debugger stops
> > inside std::transform(). Should we not be able to do this?
>
> This should work, could you please send me an example? It might be possible
> that you have used a pointer from a write-protected segment. But this is just
> my guess.

you were right. I should have used a char array as the type of my variable.
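
For anyone hitting the same crash: a string literal may be placed in read-only storage, so modifying it through a char* is undefined behaviour, whereas a char array initialised from the literal is a writable copy. A minimal sketch using only the standard library (to_lower_in_place is a hypothetical helper, not the library's to_lower):

```cpp
#include <algorithm>
#include <cctype>
#include <cstring>

// Hypothetical helper: lower-cases a writable, null-terminated buffer in place.
inline void to_lower_in_place(char* s) {
    std::transform(s, s + std::strlen(s), s, [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
}

// Usage:
//   char ok[] = "HeLLo";         // writable copy of the literal
//   to_lower_in_place(ok);       // fine
//   const char* ro = "HeLLo";    // points at the literal itself; writing through
//                                // a cast-away-const pointer is undefined behaviour
```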

> > Consider algorithms with 'nth', eg.:
> >
> > ierase_nth_copy( s, "s", 1 );
> >
> > this does not erase the 1st copy of "s", but the second. What's the
> > rationale for this? I don't find it intuitive.
>
> The index starts from zero. It is explicitly specified in the docs.
> The rationale is that all C arrays start with 0, and in for loops it is
> usual practice to count from 0.

where in the docs? The detailed explanation says:

"Search for an n-th match of search sequence in the input sequence."

which means when I specify 0, I search for the 0-th occurrence of something.
When we talk about arrays and containers it is understood by everyone that 0
is the first index. When people talk about words/strings, it does not make
sense anymore; there is no 0th index; there's the first occurrence and the
second etc. My point is that these are two very different concepts, and
practice in one cannot be used to dictate practice in the other.
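
To make the zero-based reading concrete, here is a rough sketch of an n-th match search written against std::string only. find_nth_pos is a hypothetical name, the search is case-sensitive and non-overlapping, and it is not the library's implementation; it only illustrates that n == 0 selects what one would call the first occurrence.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: position of the n-th (zero-based) non-overlapping
// occurrence of `search` in `input`, or std::string::npos if there is none.
inline std::size_t find_nth_pos(const std::string& input,
                                const std::string& search,
                                std::size_t n) {
    if (search.empty()) return std::string::npos;
    std::size_t pos = input.find(search);
    while (pos != std::string::npos && n > 0) {
        // Skip past the current match and look for the next one.
        pos = input.find(search, pos + search.size());
        --n;
    }
    return pos;
}
```

With this convention, find_nth_pos( "mississippi", "ss", 1 ) refers to the second "ss", which is exactly the behaviour questioned above.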

> > I seem to miss predicate taking functions for eg. erase() and replace(). I
> > would like to say
> >
> > erase_all_if( s, is_lower() );
>
> It is possible to add this to the interface. Currently it is possible to do it
> like this
>
> namespace sa=boost::string_algo;
> sa::replace_all( s, sa::token_finder(is_lower()), sa::empty_formater(s) );

quite verbose. I think that shows that those functions might be worth
having. On the other hand, they might not be used frequently enough to
warrant an inclusion.

> It is not a problem to add this to the 1-layer interface. The question would
> be: what would you expect from replace_all_if?

I would simply expect that I use a predicate to identify a match.
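
As a sketch of the semantics I have in mind, here is the character-level case written with the standard erase/remove_if idiom. This erase_all_if is a hypothetical free function for std::string only, and is_lower_char is a hypothetical stand-in for the library's is_lower() predicate; neither is part of the reviewed interface.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical: erase every character for which `pred` returns true.
template <typename Pred>
void erase_all_if(std::string& s, Pred pred) {
    s.erase(std::remove_if(s.begin(), s.end(), pred), s.end());
}

// Hypothetical stand-in for a classification predicate such as is_lower().
inline bool is_lower_char(char c) {
    return std::islower(static_cast<unsigned char>(c)) != 0;
}
```

A replace_all_if would follow the same pattern, with the predicate identifying each match and a replacement supplied by the caller.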

> > I think it has been proposed that the returned object should
> > support implicit conversion to bool to allow tests in if-statements and I
> > support that idea.
[snip]
> About the conversion to safebool, in my personal opinion, it is not explicit
> enough. If the result is an empty range, it is obvious that the operation did
> not find anything.
> However, because iterator_range is not used only by find algorithms,
> cast to bool can be misleading. After all, it is a kind of container.

I think the right wording would be that "it supports a subset of the
container interface". It's not a container AFAICT. It's a view into an
existing container.

what's misleading about

if( find_first( s, "blabla" ) )

instead of

if( !find_first( s, "blabla" ).empty() )
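
A rough sketch of the kind of result type I mean, assuming a modern compiler where explicit operator bool can stand in for the safe-bool idiom. match_range and find_first_sketch are hypothetical names for illustration and do not reflect the library's iterator_range or find_first.

```cpp
#include <string>

// Hypothetical view over a matched part of an existing string.
struct match_range {
    std::string::const_iterator first;
    std::string::const_iterator last;

    bool empty() const { return first == last; }

    // Usable in if-statements, but does not convert silently to int.
    explicit operator bool() const { return !empty(); }
};

// Hypothetical search: returns an empty range when nothing is found.
// The string must outlive the returned range.
inline match_range find_first_sketch(const std::string& s,
                                     const std::string& what) {
    std::string::size_type pos =
        what.empty() ? std::string::npos : s.find(what);
    if (pos == std::string::npos) return match_range{s.end(), s.end()};
    return match_range{s.begin() + pos, s.begin() + pos + what.size()};
}
```

Both spellings then work on the same object: if( find_first_sketch( s, "blabla" ) ) and if( !find_first_sketch( s, "blabla" ).empty() ).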

> > Consider find_head( s, 2 ); What are we finding here? AFAICT it is a
> > synonym for make_range( s.begin(), s.begin()+ 2 ). I think, therefore,
> > that this algorithm is not needed.
>
> The problem is when your string is not long enough. So you would need to
> check all boundary conditions. _head does it for you. Because this operation
> is quite common, it is included in the library.

ok, I did not see that. Maybe that could be explained in the docs?
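
Something like the following could go in the docs to show the difference from the naive s.begin() + n. It is only a sketch of the boundary check as I understand it from the reply above; head_range is a hypothetical name and the clamping rule is my assumption, not taken from the library source.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical: iterator pair covering at most the first n characters of s.
// Clamping to s.size() avoids walking past end() on short strings.
inline std::pair<std::string::const_iterator, std::string::const_iterator>
head_range(const std::string& s, std::size_t n) {
    const std::size_t len = std::min(n, s.size());
    return std::make_pair(
        s.begin(),
        s.begin() + static_cast<std::string::difference_type>(len));
}
```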

> > The iterator_range class. Maybe it should be called sub_string to put the
> > focus more on string algorithms instead of generic algorithms. The border
> > between string and non-string algorithms seems to be fuzzy. Implicit
> > conversion to basic_string<T> ??
>
> The whole library is designed in such a way that it is not restricted to
> std::basic_string. Therefore such a convention would break the rule.

chances are that basic_string will be used the most. Anyway, no matter what
string you use, it would be nice if one did not have to convert explicitly,
eg:

string substring = find_XX( s, YY );
cout << find_XX( s, YY );

instead of

iterator_range< string::iterator> temp = find_XX( s, YY );
string substring( temp.begin(), temp.end() );
cout << string( temp.begin(), temp.end() );

I find that too complicated for doing really simple and frequent stuff.
That's why I would like to see a substring concept.

> If you need to convert an iterator_range to a particular container, you can
> use the function copy_range ( f.e. s = copy_range<string>( range ); ).

not much better IMO.
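
To show what I mean by a substring concept, here is a minimal sketch of a view with an implicit conversion to std::string and stream output. sub_string is a hypothetical class tied to std::string for brevity; a real version would presumably be a template over the character type.

```cpp
#include <ostream>
#include <string>

// Hypothetical view into an existing std::string; it does not own the characters.
class sub_string {
public:
    sub_string(std::string::const_iterator b, std::string::const_iterator e)
        : b_(b), e_(e) {}

    std::string::const_iterator begin() const { return b_; }
    std::string::const_iterator end() const { return e_; }

    // Implicit conversion: allows `std::string copy = some_sub_string;`
    operator std::string() const { return std::string(b_, e_); }

private:
    std::string::const_iterator b_;
    std::string::const_iterator e_;
};

// Allows `std::cout << some_sub_string;` without an explicit copy at the call site.
inline std::ostream& operator<<(std::ostream& os, const sub_string& r) {
    return os << std::string(r.begin(), r.end());
}
```

The price is a hidden allocation on every conversion, which is presumably part of the objection to tying the range type to basic_string.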

> > In the section "First example" under the "Naming" bullet it says that the
> > naming follows the conventions used in the standard library. Nevertheless,
> > predicate versions do not use the suffix _if. I think it would be good to
> > follow the STL convention and change the names accordingly.
>
> As far as I know, this only applies to trim functions. It is possible to
> rename them, if it brings some benefits (like clarity).

clarity is one thing; STL naming compliance must be the other.

> > Regarding the iterator_range class, then I think it might make the
> > documentation a little harder to grasp. Something as general as this could
> > definitely be useful on its own (although I would prefer a shorter name,
> > like subrange). The basic usage for iterator_range in the string library is
> > as a sub-string, and therefore it might be more useful with a substring
> > class which provided implicit conversion to basic_string. And the library
> > could then provide two typedefs
>
> Again, I strongly disagree with adding some unnecessary affinity of the lib
> to basic_string. It goes fundamentally against the basic design principles
> of this library.
>
> The library's definition of a string is not "basic_string" but rather "an
> arbitrary sequence of characters", and it is designed on this principle.

I know, but it's also a principle that it should be easy and intuitive to use.
AFAICT, a substring should be almost an iterator_range, but with

1) implicit conversion to safe-bool
2) implicit conversion to basic_string<Ch>

this would make it somewhat more convenient to use. As it is now, the
library might even work with containers of arbitrary types, which makes it
much more than a string library; true, it is more general, but IMO less
useful.

> > Locales. Is there any reason why a locale needs to be a parameter to each
> > function? If one wants to use a specific locale in one function call, it
> > seems likely that the same locale would be used throughout the entire
> > program. I'm just wondering if it wouldn't be sufficient with some kind of
> > set_locale function which then affected all the algorithms?
>
> If your program uses just one locale at a time, you are perfectly ok with
> the default values and you can use std::locale::global to set your preferred
> locale settings.
>
> However, there are applications where there is a need to use several locales
> simultaneously.

do you mean in two different threads? Otherwise I don't get it.
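
For reference, this is how I read the per-call parameter: a sketch where the locale defaults to a copy of the global one and can be overridden for a single call. to_lower_copy_sketch is a hypothetical name built on the two-argument std::tolower from <locale>; it is not the library's to_lower_copy.

```cpp
#include <locale>
#include <string>

// Hypothetical: lower-case copy of s using the given locale.
// std::locale() yields a copy of the current global locale.
inline std::string to_lower_copy_sketch(const std::string& s,
                                        const std::locale& loc = std::locale()) {
    std::string out;
    out.reserve(s.size());
    for (char c : s) {
        out.push_back(std::tolower(c, loc));  // locale-aware overload from <locale>
    }
    return out;
}
```

Two calls in the same thread could then pass two different std::locale objects, although whether any named locale other than std::locale::classic() is installed depends on the platform.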

regards

-Thorsten

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk