Subject: Re: [boost] [string] proposal
From: Yakov Galka (ybungalobill_at_[hidden])
Date: 2011-01-26 06:42:14


On Wed, Jan 26, 2011 at 11:54, Matus Chochlik <chochlik_at_[hidden]> wrote:

> On Wed, Jan 26, 2011 at 10:37 AM, Yakov Galka <ybungalobill_at_[hidden]>
> wrote:
> > Excuse my ignorance, but can someone explain to me why people are so keen
> > on immutable strings? Aren't they basically the same as
> > 'shared_ptr<const std::string>'?
>
> I'm fairly neutral on the immutability issue; I do not oppose it if
> someone shows why it is a superior design, provided it does not
> break everything horribly (from the backward compatibility perspective).
>

Me too, but it definitely will break existing code:
    string.resize(91);
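
To illustrate the kind of breakage meant here, a minimal sketch, assuming an immutable string behaves like 'shared_ptr<const std::string>' as suggested in the quoted text above. The names can_resize, immutable_string and resized are hypothetical and exist only for this example:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Detects at compile time whether `s.resize(n)` is well-formed for type S.
template <class S, class = void>
struct can_resize : std::false_type {};

template <class S>
struct can_resize<S, decltype(void(std::declval<S&>().resize(std::size_t{})))>
    : std::true_type {};

// Hypothetical stand-in for an immutable string: shared, read-only contents.
using immutable_string = std::shared_ptr<const std::string>;

// A mutable std::string can be resized in place...
static_assert(can_resize<std::string>::value, "std::string has resize()");
// ...but the same call does not compile on read-only contents.
static_assert(!can_resize<const std::string>::value,
              "const std::string cannot be resized");

// With immutable contents, "resizing" has to produce a new string instead.
inline immutable_string resized(const immutable_string& s, std::size_t n) {
    std::string copy(*s);  // copy the shared, read-only contents
    copy.resize(n);        // mutate the private copy
    return std::make_shared<const std::string>(std::move(copy));
}
```

Under these assumptions, every existing call such as string.resize(91) would have to be rewritten into something like s = resized(s, 91).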

[...]
> >
> > ?
> > What are those properties? Isn't std::string already what it should have
> > been? Do you mean that you want to put in there any possible algorithm
> > you can imagine?
>
> What I was talking about is basically adding some more convenience
> member functions, many of which are currently implemented by the
> string_algo Boost library, to the string's interface and, more importantly,
> extending the string's interface with 'Unicode-functionality', i.e. the
> ability to traverse the string not just as a sequence of bytes but as a
> sequence of Unicode code-points and, if possible, even "logical characters".
>

My point is that 'Unicode-functionality' should be separate from the string
implementation. This code
    for(char32_t cp : codepoints(my_string));
should work with any type of my_string whose encoding is known.
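
A rough sketch of how such a free-standing codepoints() could look, assuming the bytes are valid UTF-8 (malformed input is not diagnosed). The codepoint_iterator and codepoint_range types are hypothetical and are not part of any existing library:

```cpp
#include <iterator>
#include <string>

// Hypothetical forward-only iterator that decodes UTF-8 bytes on the fly.
// Assumption: the underlying sequence is valid UTF-8.
template <class ByteIt>
class codepoint_iterator {
public:
    codepoint_iterator(ByteIt pos, ByteIt end) : pos_(pos), end_(end) {}

    char32_t operator*() const {
        ByteIt it = pos_;
        unsigned char lead = static_cast<unsigned char>(*it);
        int extra = trailing(lead);
        // Keep only the payload bits of the lead byte.
        char32_t cp = extra == 0 ? static_cast<char32_t>(lead)
                                 : static_cast<char32_t>(lead & (0x3F >> extra));
        // Append six payload bits from each continuation byte.
        for (int i = 0; i < extra && ++it != end_; ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(*it) & 0x3F);
        return cp;
    }

    codepoint_iterator& operator++() {
        // Skip the lead byte and its continuation bytes, never past the end.
        int n = 1 + trailing(static_cast<unsigned char>(*pos_));
        for (; n > 0 && pos_ != end_; --n) ++pos_;
        return *this;
    }

    bool operator!=(const codepoint_iterator& o) const { return pos_ != o.pos_; }

private:
    // Number of continuation bytes announced by a lead byte.
    static int trailing(unsigned char lead) {
        if (lead < 0x80) return 0;          // 0xxxxxxx
        if ((lead >> 5) == 0x06) return 1;  // 110xxxxx
        if ((lead >> 4) == 0x0E) return 2;  // 1110xxxx
        if ((lead >> 3) == 0x1E) return 3;  // 11110xxx
        return 0;  // stray byte: treated as a single unit in this sketch
    }

    ByteIt pos_, end_;
};

template <class ByteIt>
struct codepoint_range {
    ByteIt first, last;
    codepoint_iterator<ByteIt> begin() const { return {first, last}; }
    codepoint_iterator<ByteIt> end() const { return {last, last}; }
};

// Works with anything std::begin/std::end accept: std::string,
// std::vector<char>, a boost::iterator_range, and so on.
// The argument must outlive the returned range.
template <class String>
auto codepoints(const String& s) -> codepoint_range<decltype(std::begin(s))> {
    return {std::begin(s), std::end(s)};
}
```

The string type itself needs no changes here; the only thing the sketch relies on is knowing that the bytes are UTF-8.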

I'm not against adding convenience functions into the string. It makes the
code more readable when you chain operations. However, it goes against the
advice in this article:
http://www.drdobbs.com/184401197
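
Assuming the linked article is the one arguing that non-member, non-friend functions improve encapsulation, the alternative to a convenience member would look roughly like this. The name has_prefix is hypothetical and chosen only for illustration:

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical free-function helper. It uses only the public begin()/end()
// interface, so it needs no access to the string's internals and works
// with any sequence type, not just std::string.
// Note: a string literal passed as `prefix` would include its trailing
// '\0', so pass a std::string (or another sized range) instead.
template <class Str, class Prefix>
bool has_prefix(const Str& s, const Prefix& prefix) {
    auto s_len = std::distance(std::begin(s), std::end(s));
    auto p_len = std::distance(std::begin(prefix), std::end(prefix));
    return p_len <= s_len &&
           std::equal(std::begin(prefix), std::end(prefix), std::begin(s));
}
```

The trade-off is the one mentioned above: has_prefix(s, p) does not chain as readably as a member call s.has_prefix(p) would.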

> >
> > IMO std::string is just a container of bytes with two useful convenience
> > methods (c_str() and substr()) and a utf8 encoding that should have been
> > assumed by default but unfortunately isn't. Everything else should be
> > generic algorithms that work with sequences of characters in some
> > encoding. So, maybe it's better to focus on designing something like
> > boost::iterator_range with an encoding associated with it and algorithms
> > that work with these ranges?
>
> If that is to succeed it has to be (backward) compatible with the existing
> APIs, however borked they seem to us (me included). There are lots of string
> implementations that are *cool* but unusable by anything except algorithms
> specifically designed for them.
>

I can't quite understand what has to be backward compatible with what...
Can you please provide a few code snippets that mustn't break so I can
think about that?

-- 
Yakov
