Subject: Re: [boost] [locale] Review results for Boost.Locale library
From: Ryou Ezoe (boostcpp_at_[hidden])
Date: 2011-04-26 08:41:23
On Tue, Apr 26, 2011 at 9:27 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>> From: Mathias Gaunard <mathias.gaunard_at_[hidden]>
>> On 26/04/2011 11:17, Sebastian Redl wrote:
>> > GCC has options to control both the source (-finput-charset) and the
>> > execution character set (-fexec-charset). They both default to UTF-8.
>> > However, MSVC is more complicated. It will try to auto-detect the source
>> > character set, but while it can detect UTF-16, it will treat everything
>> > else as the system narrow encoding (usually a Windows-xxxx codepage)
>> > unless the file starts with a UTF-8-encoded BOM. The worse problem is
>> > that, except for a very new, poorly documented, and probably
>> > experimental pragma, there is *no way* to change MSVC's execution
>> > character set away from the system narrow encoding.
>> A long time ago, I asked Vladimir Prus to help me add an option to
>> Boost.Build that would automatically prepend the BOM
>> to source files when using MSVC, but unfortunately he was never able to help
>> me do this.
> The problem is that even if the source is UTF-8 with a BOM, "שלום" would
> be encoded according to the locale's 8-bit codepage, like 1255 or 936,
> and not as a UTF-8 string (codepage 65001).
> It is rather stupid, but this is how MSVC works or understands
> the place of UTF-8 in this world.
It's not stupid.
It's because the ANSI versions of the Win32 API expect these encodings.
To me, having an ordinary string literal use the source file's encoding
is a stupid idea.
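
To make the disagreement concrete, here is a rough sketch of how one could
look at the bytes a compiler actually stored for a narrow string literal.
The helper names (hex_dump, is_valid_utf8, check_known_samples) are made up
for this mail, and the sample bytes assume the four-letter Hebrew word
שלום purely for illustration: D7 A9 D7 9C D7 95 D7 9D in UTF-8 versus
F9 EC E5 ED in Windows-1255.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Render each byte of s as two uppercase hex digits, separated by spaces.
std::string hex_dump(const std::string& s) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned byte = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02X", byte);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// Return true if s is well-formed UTF-8. Rejects stray continuation bytes,
// truncated sequences, overlong forms, surrogates and values above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        unsigned long min = 0;
        if (c < 0x80) { ++i; continue; }                  // plain ASCII
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
        else return false;                                // invalid lead byte
        if (i + len > s.size()) return false;             // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;        // not a continuation
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF) return false;      // overlong or too big
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        i += len;
    }
    return true;
}

// The two byte sequences assumed above, written with hex escapes so the
// result does not depend on the compiler's source or execution charset.
void check_known_samples() {
    assert(is_valid_utf8("\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D"));  // UTF-8 bytes
    assert(!is_valid_utf8("\xF9\xEC\xE5\xED"));                 // cp1255 bytes
}
```

Passing a literal typed directly into the source file to hex_dump would show
which of the two encodings a given compiler picked. I am not asserting a
result for that here, since it depends on the compiler and its settings.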
> Unicode and Visual Studio is just broken...
-- Ryou Ezoe