From: Jeremy Maitin-Shepard (jbms_at_[hidden])
Date: 2007-06-19 14:30:19


Mathias Gaunard <mathias.gaunard_at_[hidden]> writes:

> Jeremy Maitin-Shepard wrote:
>> It occurs to me that perhaps it is not unreasonable after all to
>> restrict the library to supporting Unicode encodings for in-memory
>> character representation.

> I personally believe Unicode (not only the character set, but also its
> collations and algorithms) is the only viable way to represent
> characters, and thus should be what strings work with. (Get out, evil
> locales and other stuff!)
> Of course, various encodings can still be used for serialization.

I agree that I personally would always want to use a Unicode encoding
for handling text in my software. The question, though, is whether the
new I/O library should actually force users to use a Unicode encoding
for internal text representation. Even if other internal encodings are
supported, Boost might still provide text formatting and other
high-level text facilities only for the Unicode encodings (UTF-8,
UTF-16, and UTF-32), or even for only a single Unicode encoding.
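
To make the split between internal representation and serialization
concrete, here is a minimal sketch of the "single internal Unicode
encoding" option. It assumes the program holds text as UTF-32 in a
std::u32string (which needs a C++11 or later compiler) and converts to
UTF-8 only at the I/O boundary. The function name to_utf8 is
hypothetical, chosen for illustration; it is not part of Boost or of
any proposed I/O library.

```cpp
#include <string>

// Hypothetical helper: serializes internally held UTF-32 text as UTF-8.
// Invalid code points (surrogates, values above U+10FFFF) are replaced
// with U+FFFD, which is one possible policy among several.
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;  // REPLACEMENT CHARACTER
        }
        if (cp < 0x80) {
            // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Under this arrangement, formatting and other high-level facilities
would only ever see one encoding, and supporting another external
encoding would mean adding another converter of this kind rather than
touching the text facilities themselves.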

> Unfortunately, C++ is quite far from having good Unicode tools (not that
> other programming languages are really better -- Unicode is simply quite
> complicated, because human languages just are)

> ICU has most of the stuff, but not with the right interfaces.

A better I/O system might provide a very solid base on top of which
proper higher level text facilities can be provided, quite possibly by
incorporating pieces of ICU.

-- 
Jeremy Maitin-Shepard
