From: Jeremy Maitin-Shepard (jbms_at_[hidden])
Date: 2007-06-19 14:30:19
Mathias Gaunard <mathias.gaunard_at_[hidden]> writes:
> Jeremy Maitin-Shepard wrote:
>> It occurs to me that perhaps it is not unreasonable after all to
>> restrict the library to supporting Unicode encodings for in-memory
>> character representation.
> I personally believe Unicode (not only the character set, but also its
> collations and algorithms) is the only viable way to represent
> characters, and thus should be what strings work with. (Get out, evil
> locales and other stuff!)
> Of course, various encodings can still be used for serialization.
I agree that I personally would always want to use a Unicode encoding
for handling text in my software. The question, though, is whether the
new I/O library should actually force users to use a Unicode encoding
for internal text representation. Even if other internal encodings are
supported, Boost might still only provide actual text formatting
facilities and other high-level text facilities for all Unicode
encodings (UTF-8, UTF-16, and UTF-32) or even only a single Unicode
encoding.
> Unfortunately, C++ is quite far from having good Unicode tools (not that
> other programming languages are really better -- Unicode is simply quite
> complicated, because human languages just are)
> ICU has most of the stuff, but not with the right interfaces.
A better I/O system might provide a very solid base on top of which
proper higher level text facilities can be provided, quite possibly by
incorporating pieces of ICU.
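
To make the split between in-memory representation and serialization
concrete, here is a minimal sketch, assuming text is held in memory as
UTF-32 code points and converted to UTF-8 only when written out. The
name encode_utf8 is hypothetical; it is not part of Boost, ICU, or any
proposed I/O library interface.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost or ICU API): serialize text held in
// memory as UTF-32 code points into a UTF-8 byte string.
std::string encode_utf8(const std::vector<std::uint32_t>& code_points)
{
    std::string out;
    for (std::uint32_t cp : code_points) {
        // Reject surrogates and values outside the Unicode range.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid Unicode code point");

        if (cp < 0x80) {
            // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Only the boundary function knows about the external encoding; the rest
of a program written this way would deal purely in code points. A real
library would of course also need decoding, error policies, and the
higher-level algorithms mentioned above.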
-- Jeremy Maitin-Shepard