From: Graham (Graham_at_[hidden])
Date: 2008-03-10 18:22:20

Next message: Emil Dotchevski: "Re: [boost] shared_ptr in multithreaded environments"
Previous message: Mathias Gaunard: "[boost] shared_ptr in multithreaded environments"
Next in thread: Cory Nelson: "Re: [boost] UTF-8 conversion etc. (Cory Nelson)"
Reply: Cory Nelson: "Re: [boost] UTF-8 conversion etc. (Cory Nelson)"

>> Sebastian,
>>
>> As Unicode characters that are not in page zero can require more than 32
>> bits to encode them [yes really] this means that one 'character' can be
>> very long

> Unicode defines codepoints from 0 to 10FFFF - this can be encoded with
> 32 bits in UTF-8 and UTF-16.

Cory,

This is true for simple characters, except that current Unicode specs
require support for surrogates - which require twice that - and that's
even before you start to discuss logical grouping of characters or
graphemes, which can themselves be two or three characters long.
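
To put numbers on the sizes being discussed, here is a minimal sketch that
counts code units per code point and per sequence of code points. The names
utf8_units, to_utf16 and utf16_bits are made up for illustration and are not
taken from any Boost library; the sketch assumes valid scalar values only and
does no grapheme segmentation, so the caller decides which code points belong
together.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// True for code points that may be encoded: 0..10FFFF, excluding the
// surrogate range D800..DFFF, which is reserved for UTF-16 itself.
bool is_scalar_value(std::uint32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Number of 8-bit code units UTF-8 needs for one code point (1 to 4).
std::size_t utf8_units(std::uint32_t cp) {
    assert(is_scalar_value(cp));
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Encode one code point as UTF-16: a single unit inside the BMP,
// otherwise a high/low surrogate pair.
std::vector<std::uint16_t> to_utf16(std::uint32_t cp) {
    assert(is_scalar_value(cp));
    if (cp < 0x10000) {
        return { static_cast<std::uint16_t>(cp) };
    }
    const std::uint32_t v = cp - 0x10000;  // 20 significant bits remain
    return { static_cast<std::uint16_t>(0xD800 + (v >> 10)),
             static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)) };
}

// Total UTF-16 storage, in bits, for a sequence of code points, for
// example one user-perceived character made of a base plus combining marks.
std::size_t utf16_bits(const std::vector<std::uint32_t>& cps) {
    std::size_t units = 0;
    for (std::uint32_t cp : cps) {
        units += to_utf16(cp).size();
    }
    return units * 16;
}
```

Under these assumptions a supplementary code point such as U+1D11E comes out
as one surrogate pair (two 16-bit units) in UTF-16 and four bytes in UTF-8,
while a sequence such as a base letter followed by two combining marks
(U+0065 U+0301 U+0323) needs three 16-bit units, i.e. 48 bits.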

I am glad you recognise that normalisation support is difficult - that's
why the character support library is the hard part to develop. I
guess we just ran out of steam after that.

Yours,

Graham


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk