Subject: Re: [boost] [locale] [filesystem] Windows local 8 bit encoding
From: Ryo IGARASHI (rigarash_at_[hidden])
Date: 2012-11-30 09:10:16


Hi, Jookia,

On Fri, Nov 30, 2012 at 7:35 PM, Jookia <166291_at_[hidden]> wrote:
> Sorry if this sounds silly, but what's the problem if we just stick to one
> mapping table consistently?

The round-trip conversion is the problem. Suppose you write a program
that communicates with legacy software which can only handle "NEC selection
of IBM extension" characters (see [1] for what this means and [2] for the
complete table). CP932 encodes these characters twice, once in the NEC rows
and once in the IBM rows, and both copies map to the same Unicode code point.
If my new program converts the input to UTF-8 (1st conversion) and then
converts it back to feed the legacy software (2nd conversion), those
characters come out as "IBM extension" characters, which the legacy software
fails to handle.

This problem is unavoidable whenever we convert, even if we stick to one
mapping table consistently. It is avoidable only if I do not convert at all.

[1] https://en.wikipedia.org/wiki/Code_page_932
[2] http://www2d.biglobe.ne.jp/~msyk/charcode/cp932/uni2sjis.html (Japanese)

Best regards,

--
Ryo IGARASHI, Ph.D.
rigarash_at_[hidden]