Subject: Re: [boost] [locale] Review results for Boost.Locale library
From: Ryou Ezoe (boostcpp_at_[hidden])
Date: 2011-04-25 15:50:48


On Tue, Apr 26, 2011 at 3:55 AM, Artyom <artyomtnk_at_[hidden]> wrote:
>> From: Ryou Ezoe <boostcpp_at_[hidden]>
>>
>> Sorting by code point is not the best solution.
>> But at least it's consistent if we use one encoding.
>>
>
> No, it is not. Unicode text sorts in a different order
> in different encoding forms:
>
> UTF-8 and UTF-32 order is consistent, i.e.
>
>   for all code points a, b: utf8(a) < utf8(b) iff utf32(a) < utf32(b)
>
> However, this does not hold for UTF-16, where code points
> outside the BMP are ordered differently, i.e.
>
> it may be that codepoint(a) > codepoint(b) but UTF-16(a) sorts before
> UTF-16(b)
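
[Note: the ordering claim quoted above can be checked with a small sketch.
The helpers to_utf8 and to_utf16 below are hypothetical names written for
this illustration; they are not part of Boost.Locale. They assume a valid
Unicode scalar value as input and do no error checking. The sample code
points U+FF5E and U+10000 are chosen here only as an example.]

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one code point as UTF-8 code units.
// Assumes cp is a valid Unicode scalar value (no validation is done).
std::vector<std::uint8_t> to_utf8(char32_t cp) {
    std::vector<std::uint8_t> out;
    if (cp < 0x80) {
        out.push_back(static_cast<std::uint8_t>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<std::uint8_t>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// Hypothetical helper: encode one code point as UTF-16 code units.
// Code points above the BMP become a surrogate pair (0xD800..0xDFFF).
std::vector<std::uint16_t> to_utf16(char32_t cp) {
    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 + (v >> 10)));
        out.push_back(static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)));
    }
    return out;
}
```

With these helpers, to_utf8(0xFF5E) < to_utf8(0x10000) holds under
unit-by-unit comparison, which matches code point order. In UTF-16 the
result is reversed: to_utf16(0x10000) < to_utf16(0xFF5E), because the
lead surrogate 0xD800 is smaller than 0xFF5E.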

What do you mean?
No matter which UTF you use, the code point is the same.
You can't compare UTF-8 strings by comparing each octet.

>
> Artyom

-- 
Ryou Ezoe
