From: David Abrahams (dave_at_[hidden])
Date: 2006-07-04 17:11:13


Jeff Garland <jeff_at_[hidden]> writes:

> David Abrahams wrote:
>> Jeff Garland <jeff_at_[hidden]> writes:
>>
>>> I've been working on a little project where I've had to do lots of string
>>> processing, so I decided to put together a string type that wraps up
>>> boost.regex and boost.string_algo. I also remember a
>>> discussion in the LWG about whether the various string algorithms should be
>>> built in or not -- well consider this a test -- personally I find it easier
>>> built into the string than as standalone functions.
>>
>> I appreciate the convenience of such an interface, I really do, but
>> doesn't this design just compound the "fat interface" problems that
>> std::string already has?
>
> Yes, that's partially the point :-) I understand std::string is too big for
> some. Sadly the members it has make it hard to do the things I tend to do
> most often with strings.

Agreed.

> The fact is, if you look around at the languages people are using
> most for string processing, they offer just as many features as
> super_string and then some.

Try: python -c "help(str)"

I think python may hit a sweet spot for where to push functionality
outside of member functions.

> Somehow, programmers are managing to deal with this. I'd buy more
> into the fat interface being a problem if something in the string
> class went beyond string processing, but it doesn't.

Iterators do :)

> String processing is a big complex domain -- whole languages have
> been optimized for it -- it needs a lot of functions to cover the
> domain and make easy to read code. Any way you slice it the current
> basic_string is inferior to what most modern languages offer.

Yes, in part because there's so much interface redundancy.

> Needless to say, I understand all about stl, free functions, their power, etc,
> etc. But the big thing this misses is that having a single type that unifies
> the string processing interface means there's a single set of documentation to
> start figuring out how to do a string manipulation. I don't have to wade thru
> 50 pages of string_algorithms, 50 pages of regex docs and so on -- there's
> hundreds of functions to deal with strings there. Not to mention the
> templatization factor in the docs of these libraries which mostly detracts
> from me figuring out how to process the string. If I'm a Boost novice much of
> this great and useful string processing capability might be lost in so many
> other libraries.

I agree that dealing with CharT and traits over and over again is
handily beaten by having it encoded, once, into the string type.

> The other thing that gets me is the readability of code. With a built-in
> function, it's one less parameter to remember when calling these functions.
> It seems trivial, but I believe the code is ultimately easier to understand.
> Simple example:
>
> std::string s1("foo");
> std::string s2("bar");
> std::string s3("foo");
> //The next line makes me go read the docs again, every time
> replace_all(s1,s2,s3); //which string is modified exactly?
> or
> s1.replace_all(s2, s3); //obvious which string is modified here

Yes, I used to make that argument regularly and I still agree with it.
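
The contrast in that example can be sketched in a few lines. This is only an illustration under stated assumptions: the free function below is a hand-written stand-in that follows the (input, search, replacement) argument order of Boost.StringAlgo's replace_all, and wrapped_string is a hypothetical wrapper, not the actual super_string.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hand-written stand-in for a free-function replace_all. The argument order
// (input, search, replacement) mirrors Boost.StringAlgo's convention:
// the FIRST argument is the one modified in place.
void replace_all(std::string& input, const std::string& search,
                 const std::string& replacement) {
    if (search.empty()) return;  // avoid looping forever on an empty pattern
    std::string::size_type pos = 0;
    while ((pos = input.find(search, pos)) != std::string::npos) {
        input.replace(pos, search.size(), replacement);
        pos += replacement.size();  // continue after the inserted text
    }
}

// Hypothetical wrapper (NOT the real super_string) that exposes the same
// operation as a member, so the modified object is the receiver.
class wrapped_string {
public:
    explicit wrapped_string(std::string s) : value_(std::move(s)) {}

    wrapped_string& replace_all(const std::string& search,
                                const std::string& replacement) {
        ::replace_all(value_, search, replacement);  // delegate to the free function
        return *this;
    }

    const std::string& str() const { return value_; }

private:
    std::string value_;
};

// Both spellings perform the same in-place edit.
inline void demo_replace_all() {
    std::string s1("foo bar foo");
    replace_all(s1, "foo", "baz");   // reader must know s1 is the target
    assert(s1 == "baz bar baz");

    wrapped_string w("foo bar foo");
    w.replace_all("foo", "baz");     // target is visibly w
    assert(w.str() == "baz bar baz");
}
```

The two forms do the same work; the only difference is whether the call site itself shows which string changes.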

> I understand this flies against the current established C++ wisdom,
> but that's part of the reason I've done it. After thinking about
> it, I think the 'wisdom' is wrong. Usability and readability have
> been lost -- my code is harder to understand. I expect that
> super_string has little chance of ever making it to Boost because it
> goes too radically against some of these deeply held beliefs.
> That said, I think there's a group of folks out there that agree
> with me and are afraid to speak up.

Well, let's not make this political before we have to, OK? :)

> Now they can at least download it from the vault -- but maybe
> they'll speak up -- we'll see. In any case, it's up to individuals
> to decide whether to download and use super_string, or continue using their
> inferior string class ;-)
>
>> Even Python's string, which has a *lot* built in, doesn't try to
>> handle the regex stuff directly.
>
> There are plenty of counter examples: Perl,

Not sure that's a good example if you're going for readability; plus,
it has special operators that help (and could in principle be
implemented as free functions).

> Java, Javascript, and Ruby that build regex directly into the
> library/language.

Whoa there. Python builds regex directly into the library too. That
doesn't mean it should be part of the string.

        python -c "import sre;help(sre)"
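
The same "in the library, but not in the string" arrangement can be sketched in C++. This assumes the C++11 <regex> header, which postdates this thread (Boost.Regex offers regex_replace and regex_match as free functions in the same spirit); the helper names collapse_digits and looks_like_identifier are made up for illustration.

```cpp
#include <regex>
#include <string>

// Regex support lives in its own header; the string is just an argument.
// Replaces every run of ASCII digits with a single '#'.
std::string collapse_digits(const std::string& text) {
    static const std::regex digits("[0-9]+");
    return std::regex_replace(text, digits, "#");
}

// Returns true when the WHOLE string matches a C-style identifier pattern.
bool looks_like_identifier(const std::string& text) {
    static const std::regex ident("[A-Za-z_][A-Za-z0-9_]*");
    return std::regex_match(text, ident);
}
```

Neither helper needs anything from the string beyond its characters, which is the sense in which regex can be part of the library without being part of the string's interface.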

> It's very powerful and useful in my experience. And, of course,
> super_string doesn't take away anything, just makes these powerful
> tools more accessible and easier to use.

I agree with the idea in principle; I just want to scrutinize its
execution a bit before we all buy into it as proposed ;-)

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com
