Subject: Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-20 09:46:55

> >
> > OK, if the long term plan is:
> >
> > 1) design and implement boost::string using UTF-8 doing all the things
> > like code-point iteration, character iteration, convenience stuff like
> > starts-with, ends-with, replace, trim, etc., etc. with as much
> > backward compatibility with std::string as possible without hindering
> > progress
> >
> > 2) try really hard to push it to the standard
> >
> > then I'm on board with that.
> Some of those could be problematic (I've run across references implying
> that 0x20 isn't the universal word-separation character, so trim would
> at least need some extra parameters), but for the most part, I'd agree
> with it.

It is also locale dependent.

Unicode defines three kinds of text segments: grapheme, word, and sentence (UAX #29).

Line-break boundaries are defined as well (UAX #14).

Most of them are also locale dependent, as they require the use of ...

So unless you want to carry locale information in the string,
I don't think it is good to put these into the string itself.
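
To make that concrete, here is a minimal sketch of the alternative: the
string stays plain UTF-8 and the caller passes in the notion of "space" as
a parameter. The names decode_at, trim, SpacePredicate, ascii_space and
wider_space are hypothetical, not part of any Boost or standard API. The
input is assumed to be valid UTF-8, and wider_space is only an illustrative
subset of Unicode white space, not a locale-aware rule.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: decode the code point that starts at s[i].
// Assumes s is valid UTF-8; stores the sequence length in len.
char32_t decode_at(const std::string& s, std::size_t i, std::size_t& len) {
    unsigned char b = static_cast<unsigned char>(s[i]);
    if (b < 0x80) { len = 1; return b; }
    len = (b >= 0xF0) ? 4 : (b >= 0xE0) ? 3 : 2;
    char32_t cp = b & (0xFF >> (len + 1));   // payload bits of the lead byte
    for (std::size_t k = 1; k < len; ++k)
        cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
    return cp;
}

// The caller decides what counts as a space; the string carries no locale.
using SpacePredicate = std::function<bool(char32_t)>;

std::string trim(const std::string& s, const SpacePredicate& is_space) {
    std::size_t begin = 0, end = s.size(), len = 0;
    // Skip leading code points that the predicate accepts.
    while (begin < end && is_space(decode_at(s, begin, len)))
        begin += len;
    // Walk backwards one code point at a time.
    while (end > begin) {
        std::size_t start = end - 1;
        // Step back over continuation bytes (10xxxxxx) to the lead byte.
        while (start > begin &&
               (static_cast<unsigned char>(s[start]) & 0xC0) == 0x80)
            --start;
        if (!is_space(decode_at(s, start, len))) break;
        end = start;
    }
    return s.substr(begin, end - begin);
}

// Only U+0020, the assumption questioned in the quoted text.
bool ascii_space(char32_t cp) { return cp == 0x20; }

// Illustrative subset of Unicode white space (an assumption, not a full list).
bool wider_space(char32_t cp) {
    return cp == 0x20 || cp == 0x09 || cp == 0x0A || cp == 0x0D ||
           cp == 0xA0 || cp == 0x2003 || cp == 0x3000;
}
```

With a string that starts with U+3000 and ends with U+00A0,
trim(s, ascii_space) leaves it untouched, while trim(s, wider_space) strips
both ends. Word and sentence boundaries need considerably more than a
predicate like this, which is the reason to keep them outside the string.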


