Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-28 03:06:39
On Fri, Jan 28, 2011 at 1:43 AM, David Bergman
<David.Bergman_at_[hidden]> wrote:
> On Jan 27, 2011, at 5:52 AM, Dean Michael Berris wrote:
>
>>
>> Sorry, but for someone who's dealt with std::string for *a long time
>> (close to 8 years)* here are a few real painful problems with it:
>
> I actually agree with you here, DMB (which are also my initials...), although I am still in shock of this "new IT world" where 8 years is a long time :-) Some old farts here (me included) have dealt with std::string for 20+ years, and, yes, we share your pain :-)
>
:D (I think I should have said "long *enough* time" :D)
> Artyom is a bit silly here, since "having to know the encoded length of each character and then use some calculus to get to the index to use when setting a character" is indeed a problem for a client of a class - one does want that intelligence to be embedded in a proper string class. Yes, there are cases for UTF-8, specifically, where we can just watch for a certain byte coming up, but that is not a stable pattern and applies only to certain operations, such as his searching for whitespace.
>
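To make the "calculus" in the quoted paragraph concrete, here is a minimal sketch of what a client of a byte-oriented std::string has to do to find the n-th UTF-8 character. The helper names utf8_sequence_length and byte_offset_of_code_point are hypothetical and belong to no library, and the code assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length in bytes of the UTF-8 sequence that starts
// with `lead`. Assumes well-formed input; an unexpected byte counts as
// length 1 so that a scan always makes progress.
inline std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;          // 0xxxxxxx: ASCII
    if ((lead >> 5) == 0x06) return 2;  // 110xxxxx
    if ((lead >> 4) == 0x0E) return 3;  // 1110xxxx
    if ((lead >> 3) == 0x1E) return 4;  // 11110xxx
    return 1;                           // stray continuation or invalid byte
}

// Hypothetical helper: byte offset of the n-th code point (0-based).
// Returns s.size() when the string holds n or fewer code points.
// This is a linear scan: unlike s[n], it cannot be done in constant time.
inline std::size_t byte_offset_of_code_point(const std::string& s,
                                             std::size_t n) {
    std::size_t pos = 0;
    while (pos < s.size() && n > 0) {
        pos += utf8_sequence_length(static_cast<unsigned char>(s[pos]));
        --n;
    }
    // Clamp in case a truncated sequence pushed pos past the end.
    return pos < s.size() ? pos : s.size();
}
```

For the bytes "a\xC3\xA9z" (a, e-acute, z), the third character starts at byte 3, not at index 2, which is exactly the bookkeeping a proper string class would be expected to hide.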
> Additionally, not only having dealt with std::string for more than 20 years but also with Java, C#, Ruby etc. [sS]trings for the last 10-15 years, it is quite clear that not having a string class in the standard library that can at least handle UTF-8 properly is a wart and an embarrassment when trying to lure "soft programmers" into C++. One has to come up with defenses like those of Artyom's to salvage the situation...
>
> So, I applaud your fight here, DMB, seriously. I just happen to disagree with your specific focus.
>
> What I would do would be to focus not on a class per se but on the (GP...) concept and the associated iterators needed. Then we can see if we can also produce a proper model for that, or if we can find a sub concept (even though it would only be "sub" specification-wise but actually "super" set-wise...) that would make the current std::string a model.
>
Right, I agree. But the context of the discussion was simply around the class.
I have in my mind an almost thoroughly thought-out picture of how
strings can be dealt with from an algorithm perspective. It's complete
to the point of just needing to be written down as a document that
everybody can chime in on and maybe review for comment.
I did introduce the concept a while back when I asked for a different
string container that had its own semantics different from the way
std::string does it -- I think I called it a string_handle and that
name didn't stick nor did it evoke enough responses to merit a
(bikeshed?) discussion like this one has. There I proposed a:
1. Simple Expression-Templates based interface for concatenation and
performing lazy transformations.
2. Range-based algorithms that may be aware of certain encodings (not
just Unicode).
3. A foundation around which a family of string algorithms can be built.
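As a rough illustration of item 1, here is a minimal sketch of expression-template concatenation. It is not the string_handle design itself: the names str_ref, concat_expr, lazy and materialize are hypothetical, and only left-to-right chains of operator+ are covered. The point is that a + b + c builds a small tree of references, and the characters are copied once, into a single buffer, only when the result is asked for.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// All names here are hypothetical, chosen for illustration only.

// Leaf node: refers to an existing std::string without copying it.
// The referenced string must outlive the expression built from it.
struct str_ref {
    const std::string* s;
    std::size_t size() const { return s->size(); }
    void append_to(std::string& out) const { out.append(*s); }
};

// Interior node: records "left followed by right" and does no work yet.
template <class L, class R>
struct concat_expr {
    L left;
    R right;
    std::size_t size() const { return left.size() + right.size(); }
    void append_to(std::string& out) const {
        left.append_to(out);
        right.append_to(out);
    }
};

// Entry point: wrap a string so that operator+ builds an expression.
inline str_ref lazy(const std::string& s) { return str_ref{&s}; }

// leaf + leaf
inline concat_expr<str_ref, str_ref> operator+(str_ref a, str_ref b) {
    return {a, b};
}

// (expression) + leaf, which is what a left-to-right chain produces
template <class L, class R>
concat_expr<concat_expr<L, R>, str_ref> operator+(concat_expr<L, R> a,
                                                  str_ref b) {
    return {a, b};
}

// Evaluate the whole expression with one allocation.
template <class Expr>
std::string materialize(const Expr& e) {
    std::string out;
    out.reserve(e.size());
    e.append_to(out);
    assert(out.size() == e.size());
    return out;
}
```

With std::string a, b and c in scope, materialize(lazy(a) + lazy(b) + lazy(c)) walks the tree once, where a + b + c on plain std::string may create an intermediate temporary per operator. A lazy transformation could be added as another node type in the same way.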
Having done more GP myself recently, more and more I think the forces
that shape the abstractions and the algorithms should be "equal" to
come up with any reasonably effective solution. Much of this though
revolves around what you really want to be able to achieve with the
family of algorithms and the abstractions you want to model.
> And, I would drop the "immutable" part since this thread was supposed to be about a new thingie that could potentially replace std::string in C++3x..
>
Unfortunately that's a deal breaker for me. I see a world where
immutable strings would be the real solution to a lot of the issues
we're facing. And I would think if I (and others who believe) can make
it happen, then why shouldn't it be a replacement for std::string in
C++4x? :D
Thanks David for the encouragement and feedback.
/me hunkers down and writes the document.
-- Dean Michael Berris about.me/deanberris
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk