From: James Porter (porterj_at_[hidden])
Date: 2007-09-26 21:31:11


Actually, UTF-32 (equivalently UCS-4) *is* fixed-width (as of the
Unicode 5.0.0 standard). Page 31 of the standard (chapter 2) says:

"UTF-32 is the simplest Unicode encoding form. Each Unicode code point
is represented directly by a single 32-bit code unit. Because of this,
UTF-32 has a one-to-one relationship between encoded character and code
unit; it is a fixed-width character encoding form."
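To make the distinction concrete, here is a rough sketch (not from the standard, and not a proposed Boost interface) that counts how many code units each encoding form needs for a single code point. The function names utf8_units, utf16_units and utf32_units are made up for illustration, and the code assumes its argument is a valid Unicode scalar value (no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers, assuming cp is a valid Unicode scalar value.

// UTF-8: one to four 8-bit code units per code point.
constexpr std::size_t utf8_units(std::uint32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// UTF-16: one 16-bit code unit in the BMP, a surrogate pair above it.
constexpr std::size_t utf16_units(std::uint32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// UTF-32: always exactly one 32-bit code unit.
constexpr std::size_t utf32_units(std::uint32_t) {
    return 1;
}

// U+0041 'A', U+00E9, U+20AC and U+1F600 need 1, 2, 3 and 4 UTF-8 units.
static_assert(utf8_units(0x0041) == 1, "ASCII is one UTF-8 unit");
static_assert(utf8_units(0x00E9) == 2, "U+00E9 is two UTF-8 units");
static_assert(utf8_units(0x20AC) == 3, "U+20AC is three UTF-8 units");
static_assert(utf8_units(0x1F600) == 4, "U+1F600 is four UTF-8 units");

// UTF-16 varies between one and two units; UTF-32 never varies.
static_assert(utf16_units(0x20AC) == 1, "BMP code point is one UTF-16 unit");
static_assert(utf16_units(0x1F600) == 2, "U+1F600 needs a surrogate pair");
static_assert(utf32_units(0x1F600) == 1, "UTF-32 is one unit per code point");
```

The practical consequence is that the n-th code point of a UTF-32 buffer sits at index n, while finding it in UTF-8 or UTF-16 generally means scanning from the start.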

- James

Michael Marcin wrote:
> James Porter wrote:
>>
>> On a different note, does anyone see a practical use in having (mutable)
>> strings with variable-width character encodings? I can't think of any
>> practical use for them that wouldn't be equally well-served with an
>> array of bytes (like the email MIME-type example).
>>
>
> What encoding would you propose we use that is not variable length?
>
> UTF-8, UTF-16, and UTF-32 certainly are all variable length encodings.
>
>
> - Michael Marcin

