Subject: Re: [boost] [locale] [filesystem] Windows local 8 bit encoding
From: Ryo IGARASHI (rigarash_at_[hidden])
Date: 2012-11-30 05:29:29


Hi Artyom,

On Thu, Nov 29, 2012 at 10:45 PM, Artyom Beilis <artyomtnk_at_[hidden]> wrote:
> If so there is no such a locale under windows that works with Shift_JIS...

Strictly speaking, you are right. You cannot use a Japanese locale
with the strict Shift_JIS character set on Windows.
However, every character in Shift_JIS can be represented in CP932,
since the CP932 character set is wider than that of Shift_JIS.

The text below may be off-topic for Boost.Locale, but
it might explain why (I believe) Japanese Windows programmers
are reluctant to convert text to UTF-8 (on Windows).

If you take a CP932-encoded string, convert it to UTF-8, and then
convert the result back to CP932, the first and the third strings may be *different*.
This means that the original information is (somewhat) lost.

See the reference information from Microsoft:
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q170559
(Note that 'Shift JIS' in the above link means CP932)
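
As a rough illustration of the round-trip problem, here is a minimal sketch
in Python. It assumes that Python's built-in "cp932" codec is an acceptable
stand-in for the Windows conversion routines, which may not hold for every
code point. The helper name round_trip_cp932 and the sample bytes are chosen
only for this example. CP932 has two codes for the "because" sign (U+2235):
0x81E6 from JIS X 0208 and 0x879A from NEC row 13.

```python
# Sketch only: Python's "cp932" codec stands in for the Windows
# conversion routines (an assumption; they may differ for other code points).

def round_trip_cp932(data: bytes) -> bytes:
    """Decode CP932 bytes, pass them through UTF-8, and re-encode as CP932."""
    utf8 = data.decode("cp932").encode("utf-8")
    return utf8.decode("utf-8").encode("cp932")

# 0x879A is the NEC row 13 code for the "because" sign (U+2235).
# The same character also has a JIS X 0208 code, 0x81E6.
original = b"\x87\x9a"
restored = round_trip_cp932(original)

print(original.hex(), "->", restored.hex())
print("identical after round trip:", original == restored)
```

Both byte sequences decode to the same Unicode character, so the encoder has
to pick one of them on the way back, and the original choice is not preserved.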

This means that, in order to handle Japanese strings properly under Windows,
programmers are encouraged not to convert at all.

Moreover, at least two major (and slightly different) Shift_JIS <->
UTF-8 mapping tables exist: one provided by the Unicode Consortium
and one provided by Microsoft.
That is, the same Shift_JIS text can map to different UTF-8 strings.
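
The difference between the two tables can be seen with a short Python
sketch. It assumes that Python's "shift_jis" codec follows the JIS X 0208
style mapping and that its "cp932" codec follows the Microsoft mapping.
The helper name to_utf8 is hypothetical, and the bytes 0x8160 (the
wave dash) are just one well-known case.

```python
# Sketch only: Python's "shift_jis" and "cp932" codecs are used as
# stand-ins for the two mapping tables (an assumption).

def to_utf8(data: bytes, codec: str) -> bytes:
    """Decode bytes with the given legacy codec and re-encode as UTF-8."""
    return data.decode(codec).encode("utf-8")

# 0x8160 is the wave dash in both Shift_JIS and CP932.
wave = b"\x81\x60"

via_jis = to_utf8(wave, "shift_jis")  # JIS-style table: U+301C WAVE DASH
via_ms = to_utf8(wave, "cp932")       # Microsoft table: U+FF5E FULLWIDTH TILDE

print("shift_jis ->", via_jis.hex())
print("cp932     ->", via_ms.hex())
print("same UTF-8 bytes:", via_jis == via_ms)
```

The input bytes are identical, yet the resulting UTF-8 strings differ
depending on which table the converter uses.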

Best regards,

--
Ryo IGARASHI, Ph.D.
rigarash_at_[hidden]
