

Subject: Re: [boost] RFC: interest in Unicode codecs?
From: Esben Mose Hansen (boost_at_[hidden])
Date: 2009-02-14 06:48:38

Next message: Andrey Semashev: "Re: [boost] [log] Review-ready version in the Vault"
Previous message: Alexander Arhipenko: "Re: [boost] [log] Review-ready version in the Vault"
In reply to: Graham: "Re: [boost] RFC: interest in Unicode codecs?"
Next in thread: Phil Endecott: "Re: [boost] RFC: interest in Unicode codecs?"

On Saturday 14 February 2009 11:53:20 Graham wrote:
> Using UTF-8 can work well if you are only targeting American and Western
> Europe for non-literary use.
>
> If you need to support the rest of the world you really need to move to
> UTF-32 due to the large number of characters and the grapheme and glyph
> handling [e.g. in Urdu you can type 3 characters and they are displayed
> as a single combined glyph, and the cursor should never be placed
> between them].

I think you have gotten something mixed up. UTF-8 and UTF-32 (aka UCS-4) are
just two encodings of the same character set, including the combining
characters you mentioned (which are really not that uncommon, e.g. mêlée
contains 2 characters which could be written with combining marks). In
practical terms, UTF-32 is somewhat useless. (A case might be made for UTF-16,
though.)
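
To illustrate the point, here is a minimal sketch in Python (my choice for
illustration; nothing in this thread depends on it). It assumes only the
standard library, and the variable names are made up for the example. It
shows that the same code points round-trip through both encodings, and that
the combining-mark form of mêlée is a multi-code-point sequence in either one:

```python
import unicodedata

# "mêlée" written with precomposed ê (U+00EA) and é (U+00E9).
word = "m\u00eal\u00e9e"

# UTF-8 and UTF-32 are two byte encodings of the same code points,
# so decoding either one gives back the identical string.
utf8 = word.encode("utf-8")
utf32 = word.encode("utf-32-le")  # explicit byte order, so no BOM is added
assert utf8.decode("utf-8") == utf32.decode("utf-32-le") == word

# NFD splits each accented letter into a base letter plus a combining mark.
decomposed = unicodedata.normalize("NFD", word)

# The decomposed form has more code points than the precomposed form,
# regardless of which encoding is later used to store it.
print("code points:", len(word), "precomposed,", len(decomposed), "decomposed")
print("UTF-8 bytes:", len(utf8), "UTF-32 bytes:", len(utf32))
```

Even in UTF-32 the decomposed ê occupies two code units, so a fixed-width
encoding does not by itself tell you where a cursor may be placed.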

-- 
Kind regards, Esben


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk