Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-28 03:16:40


On Fri, Jan 28, 2011 at 2:04 AM, David Bergman
<David.Bergman_at_[hidden]> wrote:
> On Jan 27, 2011, at 10:04 AM, Dean Michael Berris wrote:
>
>> On Thu, Jan 27, 2011 at 10:55 PM, Stewart, Robert
>> <Robert.Stewart_at_[hidden]> wrote:
>
> [snip]
>
>>> That's short, but not descriptive.  The "i" prefix is more suggestive of "interface" than "immutable" to me.  Why not just go whole hog and call it "immutable_string" as Artyom suggested?
>>>
>>
>> The only objection really is that it's too long. :D Fewer characters is better.
>>
>> /me gets a thesaurus and looks up string :D
>
> Ok, but why this focus on immutability? Is that not quite orthogonal to the encoding problems discussed here (as well...)?
>

There are two reasons to focus on immutability.

First, it deals with the underlying storage. This has to be
"fool-proof" to avoid the problems of a mutable data structure. Unless
you're certain that the string will not change at any point after it
is constructed, you have to throw out a lot of the potential
optimizations in the algorithms and lazy transformations that would
make certain operations a lot more efficient.
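
To make that concrete, here is a rough sketch of the kind of thing
immutability buys you. The immutable_string class and its members are
hypothetical names made up for this illustration, not a proposed
interface: copies share one buffer, and a derived value such as a hash
can be computed once and trusted forever.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type for illustration only; not a proposed Boost interface.
class immutable_string {
public:
    explicit immutable_string(std::string s)
        : data_(std::make_shared<const std::string>(std::move(s))),
          hash_(std::hash<std::string>()(*data_)) {}

    // No mutating members: the buffer cannot change after construction.
    const char* data() const { return data_->data(); }
    std::size_t size() const { return data_->size(); }

    // Computed once in the constructor; it can never go stale.
    std::size_t hash() const { return hash_; }

    // True when both objects refer to the same underlying buffer.
    bool shares_storage_with(const immutable_string& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<const std::string> data_;
    std::size_t hash_;
};

// Usage: a copy is a reference-count bump, not a byte-for-byte copy.
inline void sharing_example() {
    immutable_string a("hello");
    immutable_string b = a;
    assert(b.shares_storage_with(a));
    assert(b.data() == a.data());
    assert(b.hash() == a.hash());
}
```

With a mutable string, neither shortcut is safe without extra
machinery: a shared buffer needs copy-on-write checks, and a stored
hash needs invalidation on every write.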

Second, encoding is largely a matter of interpretation rather than of
actual transformation. What I mean is that an encoding is supposed to
be a logical transformation rather than an actual physical
transformation of the data (although it is almost always manifested as
such) -- and it doesn't have to be an immediately applied algorithm
either. Without the immutability guarantee from the underlying data
type, the algorithm implementations can't make "clever"
re-arrangements that assume immutability. With that guarantee, data
would not need to be cached defensively since it would never change,
copying data would become largely unnecessary, and so on.
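
As a sketch of "interpretation rather than transformation", the view
below reads a byte sequence as UTF-8 while it is being traversed and
never rewrites or copies the bytes. The utf8_view name and its members
are hypothetical, and the decoder assumes well-formed input (no
validation), so treat it as an illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical view: interprets existing bytes as UTF-8 on traversal.
// Assumes well-formed UTF-8; malformed input is not detected.
// The viewed string must outlive the view.
class utf8_view {
public:
    explicit utf8_view(const std::string& bytes) : bytes_(&bytes) {}

    // Calls f(code_point) for each code point, decoding lazily.
    template <class F>
    void for_each_code_point(F f) const {
        const std::string& s = *bytes_;
        std::size_t i = 0;
        while (i < s.size()) {
            const unsigned char lead = static_cast<unsigned char>(s[i]);
            // Sequence length is determined by the lead byte.
            const std::size_t len =
                lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
            // Keep only the payload bits of the lead byte.
            std::uint32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
            for (std::size_t k = 1; k < len && i + k < s.size(); ++k)
                cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
            f(cp);
            i += len;
        }
    }

    // Number of code points, which may differ from the byte count.
    std::size_t length() const {
        std::size_t n = 0;
        for_each_code_point([&n](std::uint32_t) { ++n; });
        return n;
    }

private:
    const std::string* bytes_;
};

// Usage: the euro sign is three bytes but one code point.
inline void view_example() {
    const std::string bytes = "\xE2\x82\xAC";
    utf8_view view(bytes);
    assert(bytes.size() == 3);
    assert(view.length() == 1);
}
```

The view is only safe to hand around freely if the bytes underneath it
cannot change, which is where the immutability guarantee comes in.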

> I would prefer to have this discussion be about the encoding aspect(s) rather than immutability, unless the latter somehow intrinsically enables much improved handling (preferably at the interface level) of various encodings, and I seriously doubt that.
>

Sure, and there are already algorithms that implement encodings and
deal with ranges. They've always been there. What's being talked about
here is whether a string should have the encoding as an intrinsic
property -- and I maintain that the answer to that question is no (at
least from my POV).
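
For reference, an encoding written as an algorithm over ranges can look
something like the sketch below. encode_utf8 is a hypothetical free
function written for this mail, not an existing Boost or ICU call, and
it assumes its input holds valid Unicode scalar values.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>

// Hypothetical range algorithm: writes the UTF-8 bytes for the code
// points in [first, last) to out. No validation of the input values.
template <class InputIt, class OutputIt>
OutputIt encode_utf8(InputIt first, InputIt last, OutputIt out) {
    for (; first != last; ++first) {
        const std::uint32_t cp = *first;
        if (cp < 0x80) {                       // 1 byte
            *out++ = static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            *out++ = static_cast<char>(0xC0 | (cp >> 6));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            *out++ = static_cast<char>(0xE0 | (cp >> 12));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            *out++ = static_cast<char>(0xF0 | (cp >> 18));
            *out++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Usage: the algorithm works with any input range and output iterator.
inline void encode_example() {
    const std::uint32_t cps[] = {0x68, 0x69};  // "hi"
    std::string out;
    encode_utf8(cps, cps + 2, std::back_inserter(out));
    assert(out == "hi");
}
```

Nothing here needs the string type to know about UTF-8; the encoding
lives entirely in the algorithm.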

> So, if we keep this discussion at that of a mutable sequence of characters, according to some encoding(s), I would be less grumpy.
>

So what's wrong with using ICU and the work that others have done with
encoding already?

Am I the only person seeing the problem with strings being mutable? (I
honestly really want to know).

-- 
Dean Michael Berris
about.me/deanberris
