Boost Users :
Subject: Re: [Boost-users] Inconsistent unicode encoding between boost and wx on mac osx
From: Lars Viklund (zao_at_[hidden])
Date: 2010-02-13 16:36:46
On Sun, Feb 14, 2010 at 02:50:43AM +0530, Sachin Garg wrote:
> My project uses both boost and wxwidgets and unicode encoding by both
> is different on Mac OSX. Everything works fine on windows.
>
> Problem: Boost and WX do end up encoding the strings differently when
> converting to unicode on OSX. I am detailing an example:
>
> WX's encoding is same on both windows and osx but Boost's encoding is
> different on both platforms. It is probably not a bug but I am unable
> to figure out the reason and how to make them both work together.
> Hex dumps of Unicode encodings of this string
Unicode defines several Normalization Forms [1].
A normalization form specifies whether a character with diacritics is
represented as a single precomposed code point or decomposed into a base
character followed by combining marks.
The choice of NF is up to the OS; most importantly, OSX and Windows do
it differently. The encoding of your strings seems to be the same;
they're just composed differently.
Boost likely uses OS functions to convert between encodings, while I
assume that WX uses its own internally consistent transcoding.
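
To illustrate the composed/decomposed distinction described above, here is a
minimal sketch in Python, chosen only because its standard library ships
`unicodedata`. The sample string and the `hex_dump` helper are assumptions made
for illustration; they are not the string or the dumps from the original post.

```python
import unicodedata

# Hypothetical sample string; the string from the original post is not shown.
sample = "caf\u00e9"  # 'é' as the single precomposed code point U+00E9

nfc = unicodedata.normalize("NFC", sample)  # composed: U+00E9
nfd = unicodedata.normalize("NFD", sample)  # decomposed: 'e' followed by U+0301


def hex_dump(s, encoding="utf-8"):
    """Return the encoded bytes of s as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in s.encode(encoding))


print(hex_dump(nfc))  # 63 61 66 c3 a9
print(hex_dump(nfd))  # 63 61 66 65 cc 81

# Same encoding (UTF-8) and same visible text, but different code point sequences.
print(nfc == nfd)                                # False
print(unicodedata.normalize("NFC", nfd) == nfc)  # True
```

If two libraries hand back differently composed strings, normalizing both to
one form before comparing or storing them is one way to reconcile the results.
Which Boost or WX call would do that is not something this sketch assumes.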
[1] http://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms
-- Lars Viklund | zao_at_[hidden]