From: Beman Dawes (bdawes_at_[hidden])
Date: 2001-08-16 20:06:21


At 05:06 PM 8/16/2001, Jens Maurer wrote:

>Darin Adler wrote:
>> A second cut at the library is in
>> <http://groups.yahoo.com/group/boost/files/string_algorithm/> now.
>
>Looks good to me as a first cut.
>There will be many small functions in the future. Therefore,
>when reviewing this, we may want to define a process how we can
>add more algorithms relatively quickly and painlessly without feeling
>the need for a re-review. So we actually review the process and
>what's currently implemented as a design example.

Yes, I very much agree with that view. What I've suggested is that we
define criteria for future inclusions, and then appoint, say, two or three
library maintainers (or maybe call them Library Managers) who we feel
understand the criteria well. Then they get to decide on acceptance of any
new function. That avoids endless re-reviews, yet still conforms to the
spirit of Boost, I think.

>
>> type replacement_format;
>
>What's that?
>
>> Here are the names for some of Mike's additional proposed algorithms:
>>
>> size_type count(string, pattern);
>
>Useful. (Possibly needs a definition that only non-overlapping
>"pattern"s are counted in "string".)
>
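A minimal sketch of the non-overlapping definition mentioned above. The name
count_non_overlapping and the handling of an empty pattern are assumptions
made for illustration, not part of the proposed library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts non-overlapping occurrences of pattern in text.
std::size_t count_non_overlapping(const std::string& text,
                                  const std::string& pattern)
{
    if (pattern.empty())
        return 0;  // assumption: an empty pattern matches nothing

    std::size_t n = 0;
    std::string::size_type pos = 0;
    while ((pos = text.find(pattern, pos)) != std::string::npos) {
        ++n;
        pos += pattern.size();  // resume after the match, so matches cannot overlap
    }
    return n;
}
```

Under this definition, counting "aa" in "aaaa" gives 2 rather than 3.
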
>> void trim(string& [, characters to trim]);
>> void trim_beginning(string& [, characters to trim]);
>> void trim_end(string& [, characters to trim]);
>
>I haven't felt an urgent need for these, yet. Usage examples?

I have my own versions of these and use them all the time. A quick grep of
the application shows uses in two cases:

    * Fixed-length fields (read from input records) are trimmed to get rid
      of the padding.

    * Variable-length input records from various sources are trimmed
      because experience has shown spurious leading or trailing spaces to
      be commonplace.
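
The two uses above might look roughly like the following. This is only a
sketch: the signatures follow the names quoted earlier in this thread, the
modify-in-place form is assumed, and the set of characters to trim is made a
required argument here rather than an optional one.

```cpp
#include <string>

// Sketch only: removes trailing characters that appear in chars.
void trim_end(std::string& s, const std::string& chars)
{
    const std::string::size_type last = s.find_last_not_of(chars);
    if (last == std::string::npos)
        s.clear();          // every character is in the trim set
    else
        s.erase(last + 1);  // drop everything after the last kept character
}

// Sketch only: removes leading characters that appear in chars.
void trim_beginning(std::string& s, const std::string& chars)
{
    // find_first_not_of returns npos when nothing is kept; erase(0, npos)
    // then empties the string.
    s.erase(0, s.find_first_not_of(chars));
}

// Sketch only: trims both ends.
void trim(std::string& s, const std::string& chars)
{
    trim_end(s, chars);
    trim_beginning(s, chars);
}
```

A fixed-length field such as "SMITH     " becomes "SMITH" after
trim_end(field, " "), and a record such as "  a b  " becomes "a b" after
trim(record, " ").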

>Also, trim* can be emulated by replace_first() or replace_all()
>with a regular expression. Furthermore, I wouldn't put in
>a set of "default" characters to trim, because it will often
>be dependent on a lot of issues. Better let the programmer
>decide consciously each time the function is called.
>
>> vector<string> split(string [, delimiter characters]);
>
>Someone else already asked for an OutputIterator interface.
>I second that (the added flexibility regarding the container
>is worth the additional source code line at the caller).
>
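A rough sketch of what an OutputIterator interface for split could look like.
The signature, the choice to keep empty fields, and the split_to_vector
wrapper are assumptions for illustration only.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Sketch only: writes each field of s to out, splitting at any character
// found in delimiters. Empty fields are kept (an assumption).
template <class OutputIterator>
OutputIterator split(const std::string& s, const std::string& delimiters,
                     OutputIterator out)
{
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = s.find_first_of(delimiters, start);
        if (end == std::string::npos) {
            *out++ = s.substr(start);  // last field, possibly empty
            return out;
        }
        *out++ = s.substr(start, end - start);
        start = end + 1;               // continue after the delimiter
    }
}

// Hypothetical caller: the container is chosen here, at the cost of one
// extra source line.
std::vector<std::string> split_to_vector(const std::string& s,
                                         const std::string& delimiters)
{
    std::vector<std::string> fields;
    split(s, delimiters, std::back_inserter(fields));
    return fields;
}
```
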
>> string join(vector<string> [, string separator]);
>> string join(start iterator, end iterator [, string separator]);
>
>The "iterator" variant is sufficient.
>
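A minimal sketch of the iterator variant of join, assuming the element type
can be appended to a std::string and that the separator is a required
argument. A whole container is then joined by passing its begin() and end().

```cpp
#include <string>

// Sketch only: concatenates [first, last), placing separator between
// consecutive elements.
template <class InputIterator>
std::string join(InputIterator first, InputIterator last,
                 const std::string& separator)
{
    std::string result;
    if (first == last)
        return result;  // empty range gives an empty string

    result += *first;
    for (++first; first != last; ++first) {
        result += separator;
        result += *first;
    }
    return result;
}
```
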
>> Mike also proposed using all the above names for the "modify and return
>> result" variants, and using <xxx>_in, as the name for the "modify in
>place"
>> variants.
>
>I understand that there are a few functions where you have a design
>choice between modify-in-place (pass by reference) and "return modification
>result". The only candidates in the above list are the trim* functions.
>I'd like to see usage examples what interface is more convenient to
>the programmer for this particular function. At the moment, I favour
>modify-in-place.
>
>In general, there should be only *one* variant, and not several with
>or without _in suffix. Keep it simple.

In general, that makes sense. It shouldn't be applied too literally, as
experience may show multiple variants of some function are needed.

--Beman

