Subject: Re: [boost] [locale] Formal review of Boost.Locale library EXTENDED
From: Sergey Cheban (s.cheban_at_[hidden])
Date: 2011-04-20 05:44:53


20.04.2011 1:30, Ryou Ezoe wrote:

>>
>> Are there any problems with translate(ShiftJisToUtf8("日本語"))?
>
> I'd rather wait for compilers to support the u8 encoding prefix.
>
> The problem with that code is that we have no single rule for mapping
> shift-jis characters to UCS code points.
It's not a problem, provided that the ShiftJisToUtf8 implementation, the
source file encoding, and the translation file data are consistent.

> The mapping rules differ slightly between libraries.
> That is, some shift-jis characters are mapped to different UCS code
> points in different libraries.
>
> Actually, simply saying "shift-jis" is not right.
> There is no single encoding called "shift-jis".
> There are many shift-jis variants.
> Windows uses CP932.
> Mac uses MacJapanese.
> JIS (Japanese Industrial Standards) defined the ISO-2022-JP standard.
> These are slightly different, so mapping problems happen.
> And each library handles it in its own way.
> So there is no single consistent rule.
All you need is to choose a library that is consistent with your
source file encoding. For MSVC/Windows, it is probably OK to convert
from code page 932 to UTF-8 by calling MultiByteToWideChar with code
page 932 and then WideCharToMultiByte with CP_UTF8.
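
A minimal sketch of such a conversion, assuming a Windows build and
source literals stored as CP932. The name ShiftJisToUtf8 comes from the
question above; its signature, the exception-based error handling, and
the strict MB_ERR_INVALID_CHARS flag are my own assumptions, not part
of Boost.Locale. Win32 has no direct multibyte-to-multibyte call, so
the text goes through UTF-16 on the way:

```cpp
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper (not a Boost.Locale API): CP932 -> UTF-16 -> UTF-8.
// Inputs longer than INT_MAX bytes are not handled in this sketch.
std::string ShiftJisToUtf8(const std::string& sjis)
{
    if (sjis.empty())
        return std::string();

    const UINT kCp932 = 932; // numeric code page id; Win32 defines no CP_932 macro
    const int srcLen = static_cast<int>(sjis.size());

    // Step 1: ask for the UTF-16 length, then convert CP932 -> UTF-16.
    int wlen = MultiByteToWideChar(kCp932, MB_ERR_INVALID_CHARS,
                                   sjis.data(), srcLen, NULL, 0);
    if (wlen == 0)
        throw std::runtime_error("input is not valid CP932");
    std::wstring wide(wlen, L'\0');
    MultiByteToWideChar(kCp932, MB_ERR_INVALID_CHARS,
                        sjis.data(), srcLen, &wide[0], wlen);

    // Step 2: ask for the UTF-8 length, then convert UTF-16 -> UTF-8.
    // For CP_UTF8 the last two arguments must be NULL.
    int ulen = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen,
                                   NULL, 0, NULL, NULL);
    if (ulen == 0)
        throw std::runtime_error("UTF-16 to UTF-8 conversion failed");
    std::string utf8(ulen, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen,
                        &utf8[0], ulen, NULL, NULL);
    return utf8;
}
#endif // _WIN32
```

With this in place, translate(ShiftJisToUtf8("日本語")) looks up the
same UTF-8 key on every run, as long as the translation file was
generated with the same CP932 mapping.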

--
Sergey Cheban
