Subject: Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-20 09:46:55

In reply to: Chad Nelson: "Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]"

> >
> > OK, if the long term plan is:
> >
> > 1) design and implement boost::string using UTF-8 doing all the things
> > like code-point iteration, character iteration, convenience stuff like
> > starts-with, ends-with, replace, trim, etc., etc. with as much
> > backward compatibility with std::string as possible without hindering
> > progress
> >
> > 2) try really hard to push it to the standard
> >
> > then I'm on board with that.
>
> Some of those could be problematic (I've run across references implying
> that 0x20 isn't the universal word-separation character, so trim would
> at least need some extra parameters), but for the most part, I'd agree
> with it.

It is also locale dependent.

Unicode defines three kinds of text segment boundaries: grapheme, word, and sentence.

http://www.unicode.org/reports/tr29/

Line-break boundaries are defined as well:

http://www.unicode.org/reports/tr14/

Most of them are also locale dependent, since some languages require
dictionaries to find the boundaries.

So unless you want to carry locale information in the string,
I don't think it is good to put these into the string itself.
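
As a rough illustration of keeping this outside the string, here is a minimal sketch of a free trim function that takes the "what counts as a space" rule as a parameter. All names in it (decode_at, trim, ascii_space, some_unicode_space) are hypothetical and not part of any Boost or standard API. It assumes well-formed UTF-8, and the wider predicate covers only a handful of code points rather than the full Unicode White_Space set.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode the code point starting at s[i] and report
// its length in bytes. Assumes well-formed UTF-8; no validation is done
// beyond refusing to read past the end of the string.
inline char32_t decode_at(const std::string& s, std::size_t i, std::size_t& len) {
    unsigned char b = static_cast<unsigned char>(s[i]);
    if (b < 0x80) { len = 1; return b; }
    len = (b >= 0xF0) ? 4 : (b >= 0xE0) ? 3 : 2;
    if (i + len > s.size()) { len = 1; return b; }  // truncated sequence
    char32_t cp = b & (0xFF >> (len + 1));           // payload bits of the lead byte
    for (std::size_t k = 1; k < len; ++k)
        cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
    return cp;
}

// Hypothetical free function: the string carries no locale, so the caller
// supplies the rule for which code points should be trimmed.
template <typename IsSpace>
std::string trim(const std::string& s, IsSpace is_space) {
    std::size_t begin = 0, len = 0;
    while (begin < s.size() && is_space(decode_at(s, begin, len)))
        begin += len;

    std::size_t end = s.size();
    while (end > begin) {
        std::size_t start = end - 1;
        // Step back over continuation bytes (10xxxxxx) to the lead byte.
        while (start > begin && (static_cast<unsigned char>(s[start]) & 0xC0) == 0x80)
            --start;
        if (!is_space(decode_at(s, start, len)))
            break;
        end = start;
    }
    return s.substr(begin, end - begin);
}

// Rule 1: only U+0020 counts as a space.
inline bool ascii_space(char32_t cp) { return cp == 0x20; }

// Rule 2: a wider but still incomplete set (tab, LF, CR, space,
// U+00A0 NO-BREAK SPACE, U+3000 IDEOGRAPHIC SPACE).
inline bool some_unicode_space(char32_t cp) {
    return cp == 0x20 || cp == 0x09 || cp == 0x0A || cp == 0x0D ||
           cp == 0xA0 || cp == 0x3000;
}
```

Calling trim(text, ascii_space) leaves a leading U+3000 in place, while trim(text, some_unicode_space) removes it. A locale-aware rule could be passed the same way without the string type having to know about locales.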

Artyom

