Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-28 03:16:40

On Fri, Jan 28, 2011 at 2:04 AM, David Bergman
<David.Bergman_at_[hidden]> wrote:
> On Jan 27, 2011, at 10:04 AM, Dean Michael Berris wrote:
>> On Thu, Jan 27, 2011 at 10:55 PM, Stewart, Robert
>> <Robert.Stewart_at_[hidden]> wrote:
> [snip]
>>> That's short, but not descriptive.  The "i" prefix is more suggestive of "interface" than "immutable" to me.  Why not just go whole hog and call it "immutable_string" as Artyom suggested?
>> The only objection really is that it's too long. :D Less characters is better.
>> /me gets a thesaurus and looks up string :D
> Ok, but why this focus on immutability? Is that not a quite orthogonal concern to the encoding problematics discussed here (as well...)?

There are two reasons to focus on immutability.

First is that it deals with the underlying storage. This has to be
"fool-proof" to avoid the problems of a mutable data structure. Unless
you're certain that the string will not change at any point after it
is constructed, you have to throw away a lot of the potential
optimizations in the algorithms and lazy transformations that can make
certain operations a lot more efficient.
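
To illustrate the storage point, here is a minimal sketch, assuming a
hypothetical immutable_string class (the name, members and layout are
made up for this example and are not an existing Boost type). Because
the buffer can never change after construction, copies and substrings
can share it instead of duplicating it:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type for illustration only; not an existing Boost class.
class immutable_string {
public:
    explicit immutable_string(const std::string& s)
        : data_(std::make_shared<std::string>(s)), offset_(0), size_(s.size()) {}

    // The implicit copy constructor just shares the buffer. That is safe
    // only because nothing can modify the buffer after construction.

    // Substring without copying characters: same buffer, different window.
    immutable_string substr(std::size_t pos, std::size_t len) const {
        assert(pos <= size_ && len <= size_ - pos);
        return immutable_string(data_, offset_ + pos, len);
    }

    std::size_t size() const { return size_; }
    char operator[](std::size_t i) const { return (*data_)[offset_ + i]; }

    // Materialize a std::string copy only when the caller asks for one.
    std::string str() const { return data_->substr(offset_, size_); }

    bool shares_storage_with(const immutable_string& other) const {
        return data_ == other.data_;
    }

private:
    immutable_string(std::shared_ptr<const std::string> d,
                     std::size_t off, std::size_t n)
        : data_(std::move(d)), offset_(off), size_(n) {}

    std::shared_ptr<const std::string> data_;  // never written after construction
    std::size_t offset_;
    std::size_t size_;
};
```

With a mutable string, neither the shared copy nor the shared
substring would be safe without extra machinery such as copy-on-write
or locking.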

Second is that encoding is largely a matter of interpretation
rather than of actual transformation. What I mean by this is that an
encoding is supposed to be a logical transformation rather than an
actual physical transformation of data (although it is almost always
manifested as such) -- and it doesn't have to be an immediately
applied algorithm either. Without the immutability guarantee from the
underlying data type, you can't make "clever" re-arrangements in the
algorithm implementation that assume immutability. With that
guarantee, there would be no need to cache data since the data would
never change, copying data would be largely unnecessary, and so on.
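
As a rough sketch of "encoding as interpretation", assume a
hypothetical utf8_view class (again made up for this example, not a
Boost or ICU API) that reads shared, immutable bytes as UTF-8 code
points on demand. It assumes well-formed UTF-8 and does no validation.
Nothing is transcoded or copied up front; the bytes stay as they are
and the view only decides how to read them:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical view for illustration only; assumes well-formed UTF-8.
class utf8_view {
public:
    explicit utf8_view(std::shared_ptr<const std::string> bytes)
        : bytes_(std::move(bytes)) {
        assert(bytes_);
    }

    // Decode the code point that starts at byte offset pos and advance
    // pos past it. Decoding happens lazily, one code point per call.
    std::uint32_t next(std::size_t& pos) const {
        const std::string& b = *bytes_;
        assert(pos < b.size());
        unsigned char lead = static_cast<unsigned char>(b[pos++]);
        if (lead < 0x80) return lead;  // single-byte (ASCII) case
        int extra = lead >= 0xF0 ? 3 : lead >= 0xE0 ? 2 : 1;
        std::uint32_t cp = lead & (0x3F >> extra);  // payload bits of lead byte
        while (extra-- > 0) {
            assert(pos < b.size());
            cp = (cp << 6) | (static_cast<unsigned char>(b[pos++]) & 0x3F);
        }
        return cp;
    }

    // Number of code points, computed by walking the bytes.
    std::size_t length() const {
        std::size_t n = 0;
        for (std::size_t pos = 0; pos < bytes_->size(); ++n) next(pos);
        return n;
    }

private:
    std::shared_ptr<const std::string> bytes_;  // shared, never modified
};
```

Several such views (one per encoding, say) could sit on top of the
same buffer, and each one stays valid for as long as it holds the
pointer, precisely because the bytes underneath cannot change.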

> I would prefer to have this discussion be about the encoding aspect(s) rather than immutability, unless the latter somehow intrinsically enable a much more improved handling (and preferably at the interface level) of various encoding, and I seriously doubt that.

Sure, and there are already algorithms that implement encodings that
deal with ranges. They've always been there. What's being
talked about here is whether a string would have the encoding as an
intrinsic property of a string -- and I maintain that the answer to
that question is no (at least from my POV).

> So, if we keep this discussion at that of a mutable sequence of characters, according to some encoding(s), I would be less grumpy.

So what's wrong with using ICU and the work that others have done with
encoding already?

Am I the only person seeing the problem with strings being mutable? (I
honestly really want to know).

Dean Michael Berris
