From: Robert Bell (belvis_at_[hidden])
Date: 2004-04-06 13:22:45


Ferdinand Prantl wrote:
> Hello,
>
>
>>From: Vladimir Prus [mailto:ghost_at_[hidden]]
>>
>>>glib did a very good
>>>implementation of UTF-8 handling and Glibmm is a well done C++ wrapper
>>>but it lacks the "standardness". Something like boost::ustring COULD
>>>bring a widely accepted UTF-8 aware unicode string to C++ programmers.
>>>A somewhat relieving thought.
>>
>>I am not exactly sure if UTF-8 or UCS-4 is better as
>>universal solution, but some solution is surely needed.
>
>
> I am afraid there is no universal solution for all users. The easiest
> solution is based on the native basic_string<>, which is specialized for
> char (8-bit) to support ASCII/ANSI encodings and for wchar_t (16-bit)
> usually used for UCS-2 encoded strings. UCS-4 (32-bit) encoding would
> require another basic_string<> specialization.
>
> UCS-2 held all characters in Unicode 1.1. There was a need for more unique
> numbers and UCS-4 was introduced in Unicode 2.0. Unfortunately there is no
> 4-byte character specialization for basic_string<> in STL yet.

Technically, there isn't a 2-byte specialization either; wchar_t might
not be 16 bits.
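
To make that concrete, here is a minimal sketch that reports how wide wchar_t is on a given implementation. The helper names wchar_bits and payload_bytes are hypothetical, made up for this illustration and not part of the standard library or Boost, and constexpr assumes a C++11 or later compiler.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper (illustration only): width of wchar_t in bits
// on the current implementation. The standard does not fix this value.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Hypothetical helper (illustration only): bytes of character data held
// by a wide string. The same text can yield different results on
// different platforms, because sizeof(wchar_t) differs.
inline std::size_t payload_bytes(const std::wstring& s) {
    return s.size() * sizeof(wchar_t);
}

// Code that really depends on a 16-bit wchar_t could say so explicitly,
// so that it fails to compile elsewhere instead of silently misbehaving:
// static_assert(wchar_bits() == 16, "this code assumes a 16-bit wchar_t");
```

Printing wchar_bits() on a few compilers is a quick way to see the difference; 16 and 32 are both common results, so std::wstring alone does not pin down an encoding.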

Bob

