From: Cory Nelson (phrosty_at_[hidden])
Date: 2008-03-10 23:36:24

Next message: Emil Dotchevski: "Re: [boost] shared_ptr in multithreaded environments"
Previous message: Stjepan Rajko: "Re: [boost] Review Request: Dataflow.Signals"
In reply to: Graham: "Re: [boost] UTF-8 conversion etc. (Cory Nelson)"

On Mon, Mar 10, 2008 at 3:22 PM, Graham <Graham_at_[hidden]> wrote:
> >> Sebastian,
> >>
> >> As Unicode characters that are not in page zero can require more than 32
> >> bits to encode them [yes really] this means that one 'character' can be
> >> very long
> >
> > Unicode defines codepoints from 0 to 10FFFF - this can be encoded with
> > 32 bits in UTF-8 and UTF-16.
>
> Cory,
>
> This is true for simple characters, except that current Unicode specs
> require support for surrogates - which require twice that - and that's
> even before you start to discuss logical grouping of characters or
> graphemes which can themselves be two or three characters long.
>

A surrogate pair in UTF-16 takes up two code units for a total of 32
bits. UTF-8 does not have surrogates at all. What are you talking
about?
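
To make the arithmetic concrete, here is a minimal sketch of both encodings
for a single code point. The function names to_utf16, to_utf8 and
check_largest_code_point are hypothetical (they are not part of Boost or of
any proposed library), and the input is assumed to be a valid scalar value
in the range 0 to 10FFFF that is not itself in the surrogate range; no
validation is done.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one code point as UTF-16 code units.
// Assumes cp <= 0x10FFFF and cp is not in 0xD800..0xDFFF.
std::vector<std::uint16_t> to_utf16(std::uint32_t cp)
{
    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit code unit.
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Supplementary planes: 20 bits split across a surrogate pair.
        cp -= 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));   // high surrogate
        out.push_back(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
    }
    return out;
}

// Hypothetical helper: encode one code point as UTF-8 bytes.
// Same assumptions as to_utf16. No surrogates are involved here.
std::vector<std::uint8_t> to_utf8(std::uint32_t cp)
{
    std::vector<std::uint8_t> out;
    if (cp < 0x80) {
        out.push_back(static_cast<std::uint8_t>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<std::uint8_t>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// The largest code point needs at most 2 x 16 bits in UTF-16
// and at most 4 x 8 bits in UTF-8, i.e. 32 bits either way.
void check_largest_code_point()
{
    assert(to_utf16(0x10FFFF).size() == 2);
    assert(to_utf8(0x10FFFF).size() == 4);
}
```

Calling check_largest_code_point() exercises the upper bound. Grapheme
clusters made of several code points are a separate matter and are not
covered by this sketch.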

-- 
Cory Nelson

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk