Subject: Re: [boost] [string] proposal
From: Dave Abrahams (dave_at_[hidden])
Date: 2011-01-21 11:55:12


At Fri, 21 Jan 2011 20:07:51 +0800,
Dean Michael Berris wrote:
>
> On Fri, Jan 21, 2011 at 7:25 PM, Matus Chochlik <chochlik_at_[hidden]> wrote:
> > Dear list,
> >
> > following the whole string encoding discussion I would like
> > to make some suggestions.
> >
> [snip]
> >
> > Let us create a class called boost::string that will have
> > all the properties that a string handling class in 2011+ A.D.
> > should have, basically what std::string should have been.
> >
>
> +1
>
> [snip]
> >
> > Also I've uploaded to the vault the file string_proposal.zip
> > containing my (naive and inexpert) idea of what the
> > interface for boost::string and the related classes could
> > look like (it still needs some work and it is completely
> > un-optimized, un-beautified, etc.).
> >
> > /me ducks and covers :)
> >
>
> Maybe you have a publicly available Git repository -- maybe on GitHub
> -- so that we'd have a better discussion going?
>
> Mostly I'm interested in seeing a string class that is:
>
> 1. Immutable. No ifs or buts about it. I don't want a string to be
> modifiable. Period. You can create it, and once it's created, that's
> it.

Do you want to prevent
1. wholesale mutation such as

          x = y
          x += y

or just

2. per-char mutation such as

          x[10] = 'a'

?

Eliminating #2 does a lot for implementation flexibility
(e.g. allowing refcounts or GC to be used cleanly), and can be useful
for thread safety if there's no "small string optimization," because
the buffers holding the chars are truly immutable.
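
To make the distinction concrete, here is a minimal sketch of a string
that eliminates only #2. The class name istring and all of its members
are hypothetical, invented for illustration; this is not the interface
in string_proposal.zip. The buffer is shared and never written to after
construction, so x = y and x += y only rebind x to a different buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical illustration only; not the proposed boost::string interface.
class istring {
public:
    istring() : buf_(std::make_shared<std::string>()) {}
    istring(const char* s) : buf_(std::make_shared<std::string>(s)) {}

    // Wholesale mutation (#1) is allowed. x = y uses the implicit copy
    // assignment, which rebinds x to y's buffer. x += y rebinds x to a
    // freshly built buffer and leaves the old buffer untouched.
    istring& operator+=(const istring& rhs) {
        buf_ = std::make_shared<std::string>(*buf_ + *rhs.buf_);
        return *this;
    }

    // Per-char mutation (#2) is eliminated: the element is returned by
    // value, so x[10] = 'a' does not compile.
    char operator[](std::size_t i) const {
        assert(i < buf_->size());
        return (*buf_)[i];
    }

    std::size_t size() const { return buf_->size(); }
    const std::string& str() const { return *buf_; }

    // True when both objects refer to the same immutable buffer.
    bool shares_buffer_with(const istring& other) const {
        return buf_ == other.buf_;
    }

private:
    std::shared_ptr<const std::string> buf_;  // never modified once created
};
```

In this sketch a copy shares its source's buffer until one of them is
reassigned, and no reader can ever observe a buffer changing under it.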

However, preventing #1 is a more serious matter...

> 2. Has real value semantics. This means, once you've copied it, that's
> really copied. No funky copy-on-write reference-counting
> mumbo-jumbo.

I guess you're talking about just per-char mutation, then, because
value semantics implies assignability.
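
As a rough illustration of the assignability point, the sketch below
uses a hypothetical type, frozen_string, that forbids wholesale mutation
by holding its data in a const member. The type is an assumption made up
for this example. It can still be copied, but it can no longer be
assigned.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type that forbids wholesale mutation.
struct frozen_string {
    explicit frozen_string(std::string s) : data(std::move(s)) {}
    const std::string data;  // const member: implicit copy assignment is deleted
};

static_assert(std::is_copy_constructible<frozen_string>::value,
              "a frozen_string can still be copied");
static_assert(!std::is_copy_assignable<frozen_string>::value,
              "but x = y no longer compiles");
static_assert(std::is_copy_assignable<std::string>::value,
              "std::string, by contrast, is assignable");
```

A type like this cannot be used with generic code that assigns elements,
such as std::sort or std::vector::erase.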

> 3. Has all the algorithms that apply to it defined externally.
>
> 4. Looks like a real STL container except the iterator type is smarter
> than your average iterator.
>
> Encoding is a matter of external interpretation and I think should not
> be part of a string's interface. You can have wrappers that interpret
> a string as a UTF-* string.

What does it iterate over? chars? code points? characters?
Something else?
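
The answers differ even for tiny inputs. The sketch below assumes the
bytes are well-formed UTF-8, and the function names are made up for
illustration. It counts chars and code points for the same string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of chars (UTF-8 code units) in s.
std::size_t count_code_units(const std::string& s) { return s.size(); }

// Number of code points, assuming s is well-formed UTF-8: each code point
// contributes exactly one byte that is not a continuation byte (10xxxxxx).
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

void example() {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT, which a reader sees
    // as a single accented character.
    const std::string s = "e\xCC\x81";
    assert(count_code_units(s) == 3);   // three chars
    assert(count_code_points(s) == 2);  // two code points
    // Counting user-perceived characters (here: one) needs Unicode
    // grapheme-cluster data and is outside the scope of this sketch.
}
```

An iterator over the same buffer would therefore take three, two, or one
step depending on which of these units it is defined over.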

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
