Subject: Re: [boost] [locale] Review results for Boost.Locale library
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-26 08:27:06
> From: Mathias Gaunard <mathias.gaunard_at_[hidden]>
>
> On 26/04/2011 11:17, Sebastian Redl wrote:
>
> > GCC has options to control both the source (-finput-charset) and the
> > execution character set (-fexec-charset). They both default to UTF-8.
> > However, MSVC is more complicated. It will try to auto-detect the source
> > character set, but while it can detect UTF-16, it will treat everything
> > else as the system narrow encoding (usually a Windows-xxxx codepage)
> > unless the file starts with a UTF-8-encoded BOM. The worse problem is
> > that, except for a very new, poorly documented, and probably
> > experimental pragma, there is *no way* to change MSVC's execution
> > character set away from the system narrow encoding.
>
> A long time ago, I asked Vladimir Prus to help me add an option to
> Boost.Build that would allow to automatically prepend the BOM
> to source files when using MSVC, but unfortunately he was never able to help
> me do this.
>
The problem is that even if the source is UTF-8 with a BOM, "שלום" would
be encoded according to the locale's 8-bit codepage, like 1255 or 936,
and not as a UTF-8 string (codepage 65001).
It is rather stupid, but this is how MSVC works, or how it understands
the place of UTF-8 in this world.
Unicode in Visual Studio is just broken...
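
To make the difference concrete, here is a minimal sketch of the two byte sequences involved. It assumes the word in question is the four Hebrew letters U+05E9 U+05DC U+05D5 U+05DD, and the helper names (is_utf8_shalom, hex_dump) are made up for illustration. Hex escapes put code units into the literal directly, so they should not be transcoded whatever the compiler thinks the source or execution character set is.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// UTF-8 bytes of U+05E9 U+05DC U+05D5 U+05DD, two bytes per letter.
// Hex escapes are code units of the execution character set, so the
// compiler stores them as written.
static const char utf8_shalom[] = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D";

// The same four letters in codepage 1255, one byte per letter. This is
// what a plain narrow literal would be expected to hold when the
// execution character set is Windows-1255.
static const char cp1255_shalom[] = "\xF9\xEC\xE5\xED";

// Hypothetical helper: true if the literal holds exactly the UTF-8 bytes.
inline bool is_utf8_shalom(const char* literal) {
    return std::strcmp(literal, utf8_shalom) == 0;
}

// Hypothetical helper: render the bytes of a narrow string as hex, so the
// actual contents of a literal can be inspected at run time.
inline std::string hex_dump(const char* s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (; *s != '\0'; ++s) {
        const unsigned char b = static_cast<unsigned char>(*s);
        if (!out.empty()) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}
```

Calling hex_dump on a plain narrow literal of the word in a given build shows which of the two encodings the compiler actually chose, and something like assert(is_utf8_shalom(...)) on that literal would be expected to fail in the MSVC case described above. Spelling the bytes out with escapes is ugly, but it is one way to get a UTF-8 narrow string that does not depend on the compiler's settings.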
Artyom