From: Edward Diener (eddielee_at_[hidden])
Date: 2006-07-03 08:04:12


Jeff Garland wrote:
> David Abrahams wrote:
>> Jeff Garland <jeff_at_[hidden]> writes:
>>
> >>> I've been working on a little project where I've had to do lots of string
>>> processing, so I decided to put together a string type that wraps up
>>> boost.regex and boost.string_algo into a string type. I also remember a
>>> discussion in the LWG about whether the various string algorithms should be
>>> built in or not -- well consider this a test -- personally I find it easier
>>> built into the string than as standalone functions.
>> I appreciate the convenience of such an interface, I really do, but
>> doesn't this design just compound the "fat interface" problems that
>> std::string already has?
>
> Yes, that's partially the point :-) I understand std::string is too big for
> some. Sadly, the members it has make it hard to do the things I tend to do
> most often with strings. The fact is, if you look around at the
> languages people are using most for string processing, they offer just as many
> features as super_string and then some. Somehow, programmers are managing to
> deal with this. I'd buy more into the fat interface being a problem if
> something in the string class went beyond string processing, but it doesn't.
> String processing is a big complex domain -- whole languages have been
> optimized for it -- it needs a lot of functions to cover the domain and make
> easy-to-read code. Any way you slice it, the current basic_string is inferior
> to what most modern languages offer.
>
> Needless to say, I understand all about stl, free functions, their power, etc,
> etc. But the big thing this misses is that having a single type that unifies
> the string processing interface means there's a single set of documentation to
> start figuring out how to do a string manipulation.

I am with you, Jeff. I do not think that std::string is too fat but only
that a design mistake was made with it. The mistake is that after
specifying a std::string constructor that takes a C null-terminated
string (const char *), which std::string has, all other functionality
dealing with a string should have been in terms of std::string, and
nothing else should have been in terms of a C null-terminated string.
This is the principle of making a clear interface which has a single
good way of doing things, rather than a muddy interface with numerous
ways of doing the same thing. Other than this design mistake, no doubt
unfortunately done to cater to the C crowd, std::string is fine for what
it does and is not too fat at all.
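
As a minimal sketch of the overlap I mean (the three function names are only illustrative), here is one operation, appending "bar" to "foo", spelled through three separate std::string::append overloads. Only the first deals purely in std::string; the other two exist for C strings:

```cpp
#include <string>

// Overload taking a std::string: the single form argued for above.
std::string via_string() {
    std::string s("foo");
    s.append(std::string("bar"));
    return s;
}

// Overload taking a null-terminated C string.
std::string via_c_string() {
    std::string s("foo");
    s.append("bar");
    return s;
}

// Overload taking a character pointer plus an explicit length.
std::string via_pointer_and_length() {
    std::string s("foo");
    s.append("bar", 3);
    return s;
}
```

All three produce the same string. With only the std::string overload, the call s.append("bar") would still compile, because the converting constructor turns the C string into a std::string first.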

> I don't have to wade thru
> 50 pages of string_algorithms, 50 pages of regex docs and so on -- there's
> hundreds of functions to deal with strings there. Not to mention the
> templatization factor in the docs of these libraries which mostly detracts
> from me figuring out how to process the string. If I'm a Boost novice much of
> this great and useful string processing capability might be lost in so many
> other libraries.
>
> The other thing that gets me is the readability of code. With a built-in
> function, it's one less parameter to remember when calling these functions.
> It seems trivial, but I believe the code is ultimately easier to understand.
> Simple example:
>
> std::string s1("foo");
> std::string s2("bar");
> std::string s3("foo");
> //The next line makes me go read the docs again, every time
> replace_all(s1,s2,s3); //which string is modified exactly?
> or
> s1.replace_all(s2, s3); //obvious which string is modified here
>
> I understand this flies against the current established C++ wisdom, but that's
> part of the reason I've done it. After thinking about it, I think the
> 'wisdom' is wrong. Usability and readability have been lost -- my code is
> harder to understand. I expect that super_string has little chance of ever
> making it to Boost because it goes too radically against some of these
> deeply held beliefs. That said, I think there's a group of folks out there
> that agree with me and are afraid to speak up.

I will speak up. The passion for loosely coupled free functions has gone
too far. It works when there is a reason for it, usually because it is a
function template and must deal with different types, à la the algorithms
in the C++ standard library, but is not a solution for all situations. I
am for a rich string class and think that super_string is the right
idea. My only difference is that I want a string class to only deal in
C++ std::strings at all times, once a constructor has been provided for
converting a null-terminated C string into the string class, in order to
make the interface much cleaner and clearer.
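
To make that concrete, here is a rough sketch of the shape I have in mind. The class name rich_string and its members are hypothetical, and this is not the actual super_string interface. It uses only the standard library, where super_string wraps boost.string_algo. The C string is converted once, in the constructor, and every other member deals only in the string type:

```cpp
#include <string>

// Hypothetical class for illustration only; not the super_string interface.
class rich_string {
public:
    // The only place a null-terminated C string is accepted.
    rich_string(const char* s) : data_(s) {}
    rich_string(const std::string& s) : data_(s) {}

    // Replaces every occurrence of `from` with `to` in *this* string,
    // so there is no question about which string is modified.
    rich_string& replace_all(const rich_string& from, const rich_string& to) {
        if (from.data_.empty()) {
            return *this;  // nothing to search for
        }
        std::string::size_type pos = 0;
        while ((pos = data_.find(from.data_, pos)) != std::string::npos) {
            data_.replace(pos, from.data_.size(), to.data_);
            pos += to.data_.size();  // continue after the inserted text
        }
        return *this;
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};
```

A call such as s.replace_all("foo", "baz") still works with string literals, because each literal is converted through the constructor; no member needs a separate const char * overload.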

> Now they can at least download
> it from the vault -- but maybe they'll speak up -- we'll see. In any case,
> it's up to individuals to decide to download and use super_string, or continue
> using their inferior string class ;-)
>
>> Even Python's string, which has a *lot* built in, doesn't try to
>> handle the regex stuff directly.
>
> There are plenty of counter examples: Perl, Java, JavaScript, and Ruby that
> build regex directly into the library/language. It's very powerful and useful
> in my experience. And, of course, super_string doesn't take away anything,
> just makes these powerful tools more accessible and easier to use.
>
> Jeff


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk