Subject: Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]
From: Patrick Horgan (phorgan1_at_[hidden])
Date: 2011-01-20 03:05:47
On 01/19/2011 11:54 AM, Chad Nelson wrote:
> On Wed, 19 Jan 2011 09:58:13 -0500
> Edward Diener<eldiener_at_[hidden]> wrote:
>
>>> I am a believer ;) and when people realize that UTF-8 is the way to
>>> go, the pesky problems will vanish. Believe me today with ANSI
>> I do not believe that UTF-8 is the way to go. In fact I know it is
>> not, except perhaps for the very near future for some programmers
>> ( Linux advocates ).
>>
>> Inevitably a Unicode standard will be adapted where every character
>> of every language will be represented by a single fixed length number
>> of bits. [...]
> I'm no Unicode expert, but the reason this hasn't happened might be
> combinatorial explosion. In which case it might never happen. But I
> could well be wrong. And I hope I am, the design you outline is
> something I'd love to see.
It's already here and has been for a long time. That's just UCS encoded
as UTF-32. UCS isn't a new thing. Work on the standard started in the
late 80s and it was first copyrighted in 1991. They've come a long way.
All the common languages and many of the uncommon languages are
supported. Many dead languages are already supported. Languages with
support added in Unicode 5.1 and 5.2 include Cham, Kayah Li, Lepcha, Ol
Chiki, Rejang, Saurashtra, Sundanese, Vai, Bamum, Javanese, Lisu, Meetei
Mayek, Samaritan, Tai Tham, and Tai Viet.
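
To make the fixed-length point concrete, here is a minimal C++11 sketch.
The helper names utf32_code_points and utf8_code_points are made up for
illustration, and the UTF-8 counter assumes well-formed input. In UTF-32
every code point occupies exactly one char32_t, so counting is just the
string length; in UTF-8 the same text takes one to four bytes per code
point, so you have to skip continuation bytes.

```cpp
#include <cstddef>
#include <string>

// UTF-32: one fixed-width 32-bit unit per code point,
// so the number of code points is simply the string length.
std::size_t utf32_code_points(const std::u32string& s) {
    return s.size();
}

// UTF-8: variable width. Assuming well-formed input, count every byte
// that is NOT a continuation byte (continuation bytes look like 10xxxxxx).
std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For example, the four code points U+0061, U+00E9, U+20AC and U+1F600
are four char32_t units in UTF-32 but ten bytes (1 + 2 + 3 + 4) in
UTF-8.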
Patrick
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk