Subject: Re: [boost] [locale] Formal review of Boost.Locale library EXTENDED
From: Ryou Ezoe (boostcpp_at_[hidden])
Date: 2011-04-19 17:30:09


On Wed, Apr 20, 2011 at 12:51 AM, Sergey Cheban <s.cheban_at_[hidden]> wrote:
> On 19.04.2011 6:14, Ryou Ezoe wrote:
>
>> I bet non-English speakers will use this library with non-English languages.
>>
>> translate("日本語")
>
> Are there any problems with translate(ShiftJisToUtf8("日本語"))?

I'd rather wait for compilers to support the u8 encoding prefix.

The problem with that code is that there is no single rule for mapping
Shift-JIS characters to UCS code points.
The mapping rules differ slightly from library to library.
That is, some Shift-JIS characters are mapped to different UCS code
points by different libraries.

Actually, simply saying "shift-jis" is not accurate.
There is no single encoding called "shift-jis".
There are many Shift-JIS variants.
Windows uses CP932.
Mac uses MacJapanese.
JIS (Japanese Industrial Standards) defined the ISO-2022-JP standard.
These differ slightly, and that is where the mapping problem comes from.
Each library handles the differences in its own way,
so there is no single consistent rule.
This is worse than the UCS normalization problem.
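
To make the mapping problem concrete, here is a minimal sketch. It is
written in Python only because its standard library ships two codecs
that are both commonly called "Shift-JIS" (`shift_jis` and `cp932`).
The language, the codec names, and the helper `code_points` are
illustrative assumptions, not something taken from Boost.Locale. It
decodes the same two bytes with both codecs and prints the resulting
code points.

```python
# Two codecs from Python's standard library that are both loosely
# referred to as "Shift-JIS". Using them here is an assumption made
# for illustration; other libraries ship their own mapping tables.
CODECS = ("shift_jis", "cp932")


def code_points(raw, codec):
    """Decode raw bytes with the given codec and return U+XXXX labels.

    Hypothetical helper, named only for this example.
    """
    return ["U+{:04X}".format(ord(ch)) for ch in raw.decode(codec)]


# 0x81 0x60 is the wave dash position in the Shift-JIS byte layout.
WAVE_DASH_BYTES = b"\x81\x60"

for codec in CODECS:
    # shift_jis yields U+301C (WAVE DASH),
    # cp932 yields U+FF5E (FULLWIDTH TILDE).
    print(codec, code_points(WAVE_DASH_BYTES, codec))

# Ordinary kanji round-trip identically in both codecs, which is why
# the difference often goes unnoticed until a symbol like this appears.
KANJI = "\u65e5\u672c\u8a9e"  # the string from the translate() example
print(KANJI.encode("shift_jis") == KANJI.encode("cp932"))
```

The same input bytes end up as two different code points, so a string
converted by one library may not compare equal to the "same" string
converted by another. A translation lookup keyed on the converted text
would then miss.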

We shouldn't use Shift-JIS anymore.
Converting from Shift-JIS is not recommended.
We should use one of the UCS encodings directly.
That's why I am not saying that Boost.Locale should handle all Shift-JIS
variants, JIS (yet another standard that includes ISO-2022-JP; it is
not a Shift-JIS variant), EUC-JP, and every other encoding that has
been used at some point in history.

>
>
> --
> Sergey Cheban

-- 
Ryou Ezoe
