From: Jeff Garland (jeff_at_[hidden])
Date: 2006-07-02 14:51:32


David Abrahams wrote:
> Jeff Garland <jeff_at_[hidden]> writes:
>
>> I've been working on a little project where I've had to do lots of string
>> processing, so I decided to put together a string type that wraps up
>> boost.regex and boost.string_algo into a string type. I also remember a
>> discussion in the LWG about whether the various string algorithms should be
>> built in or not -- well consider this a test -- personally I find it easier
>> built into the string than as standalone functions.
>
> I appreciate the convenience of such an interface, I really do, but
> doesn't this design just compound the "fat interface" problems that
> std::string already has?

Yes, that's partially the point :-) I understand std::string is too big for
some. Sadly, the members it has make it hard to do the things I do with
strings most often. The fact is, if you look around at the languages people
are using most for string processing, they offer just as many features as
super_string and then some. Somehow, programmers are managing to deal with
this. I'd buy more into the fat interface being a problem if something in the
string class went beyond string processing, but it doesn't. String processing
is a big, complex domain -- whole languages have been optimized for it -- and
it needs a lot of functions to cover the domain and make code easy to read.
Any way you slice it, the current basic_string is inferior to what most modern
languages offer.

Needless to say, I understand all about the STL, free functions, their power,
etc. But the big thing this misses is that having a single type that unifies
the string processing interface means there's a single set of documentation to
start from when figuring out how to do a string manipulation. I don't have to
wade through 50 pages of string_algorithms, 50 pages of regex docs, and so on
-- there are hundreds of functions to deal with strings there. Not to mention
the templatization factor in the docs of these libraries, which mostly
detracts from me figuring out how to process the string. If I'm a Boost
novice, much of this great and useful string processing capability might be
lost among so many other libraries.

The other thing that gets me is the readability of code. With a built-in
function, it's one less parameter to remember when calling these functions.
It seems trivial, but I believe the code is ultimately easier to understand.
Simple example:

    std::string s1("foo");
    std::string s2("bar");
    std::string s3("foo");
    //The next line makes me go read the docs again, every time
    replace_all(s1,s2,s3); //which string is modified exactly?
or
    s1.replace_all(s2, s3); //obvious which string is modified here
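
To make the member-function form concrete, here is a minimal sketch using only
the standard library. The demo_string type and its example() helper are
hypothetical names invented for illustration; this is not the super_string
implementation, just one plausible way a replace_all member could be written.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper for illustration only; this is NOT super_string.
class demo_string {
public:
    demo_string(const char* s) : data_(s) {}
    demo_string(const std::string& s) : data_(s) {}

    // Replace every occurrence of `from` with `to`, modifying *this in place.
    demo_string& replace_all(const std::string& from, const std::string& to) {
        if (from.empty()) return *this;  // nothing to search for
        std::string::size_type pos = 0;
        while ((pos = data_.find(from, pos)) != std::string::npos) {
            data_.replace(pos, from.size(), to);
            pos += to.size();  // continue after the inserted text
        }
        return *this;
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// Usage: the object being modified is the one to the left of the dot.
inline void example() {
    demo_string s1("foo bar foo");
    s1.replace_all("foo", "baz");
    assert(s1.str() == "baz bar baz");
}
```

With this shape, the call site reads as "modify s1", and the two arguments are
only the search text and its replacement.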

I understand this flies against the current established C++ wisdom, but that's
part of the reason I've done it. After thinking about it, I think the
'wisdom' is wrong. Usability and readability have been lost -- my code is
harder to understand. I expect that super_string has little chance of ever
making it into Boost because it goes too radically against some of these
deeply held beliefs. That said, I think there's a group of folks out there
who agree with me and are afraid to speak up. Now they can at least download
it from the vault -- but maybe they'll speak up -- we'll see. In any case,
it's up to individuals to decide whether to download and use super_string or
continue using their inferior string class ;-)

> Even Python's string, which has a *lot* built in, doesn't try to
> handle the regex stuff directly.

There are plenty of counterexamples: Perl, Java, JavaScript, and Ruby all
build regex directly into the library/language. It's very powerful and useful
in my experience. And, of course, super_string doesn't take away anything,
just makes these powerful tools more accessible and easier to use.
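
As a rough sketch of what "regex built into the string" can look like, the
block below wraps regex calls as members. It assumes the C++11 std::regex
facility as a stand-in for boost.regex, and the rx_string type, its member
names, and the regex_example() helper are all hypothetical; none of them are
taken from the super_string interface.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical type for illustration only; not the super_string interface.
class rx_string {
public:
    rx_string(const std::string& s) : data_(s) {}

    // True if the pattern matches anywhere in the string.
    bool contains_regex(const std::string& pattern) const {
        return std::regex_search(data_, std::regex(pattern));
    }

    // Replace every match of the pattern, modifying *this in place.
    rx_string& replace_all_regex(const std::string& pattern,
                                 const std::string& replacement) {
        data_ = std::regex_replace(data_, std::regex(pattern), replacement);
        return *this;
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// Usage: searching and replacing are both calls on the string itself.
inline void regex_example() {
    rx_string s("2006-07-02");
    assert(s.contains_regex("\\d{4}"));
    s.replace_all_regex("-", "/");
    assert(s.str() == "2006/07/02");
}
```

Nothing is taken away here: the free functions remain available, and the
members only forward to them.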

Jeff

