Subject: Re: [boost] [locale] Review results for Boost.Locale library
From: Ryou Ezoe (boostcpp_at_[hidden])
Date: 2011-04-26 21:47:07

Next message: Phil Bouchard: "Re: [boost] [Memory Managed Pointer] Review Request"
Previous message: Jeremy Maitin-Shepard: "Re: [boost] [locale] Review results for Boost.Locale library"
In reply to: Artyom: "Re: [boost] [locale] Review results for Boost.Locale library"
Next in thread: Ryou Ezoe: "Re: [boost] [locale] Review results for Boost.Locale library"

On Tue, Apr 26, 2011 at 9:27 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>> From: Mathias Gaunard <mathias.gaunard_at_[hidden]>
>>
>> On 26/04/2011 11:17, Sebastian Redl wrote:
>>
>> > GCC has options to control both the source (-finput-charset) and the
>> > execution character set (-fexec-charset). They both default to UTF-8.
>> > However, MSVC is more complicated. It will try to auto-detect the source
>> > character set, but while it can detect UTF-16, it will treat everything
>> > else as the system narrow encoding (usually a Windows-xxxx codepage)
>> > unless the file starts with a UTF-8-encoded BOM. The worse problem is
>> > that, except for a very new, poorly documented, and probably
>> > experimental pragma, there is *no way* to change MSVC's execution
>> > character set away from the system narrow encoding.
>>
>> A long time ago, I asked Vladimir Prus to help me add an option to
>> Boost.Build that would allow to automatically prepend the BOM
>> to source files when using MSVC, but unfortunately he was never able to help
>> me do this.
>>
>
>
> The problem even if the source is UTF-8 with BOM "שלום" would
> be encoded according to locale's 8bit codepage like 1255 or 936
> and not UTF-8 string (codepage 65001).
>
> It is rather stupid, but this is how MSVC works or understands
> the place of UTF-8 in this world.
>
> Unicode and Visual Studio is just broken...
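
One way to see what the execution character set did to a narrow literal is
to dump its bytes and compare them with the known UTF-8 encoding. The
following is only a sketch under stated assumptions: the names kLiteral,
kUtf8Shalom, hex_dump and literal_is_utf8 are made up for illustration, and
the literal is written with universal character names (U+05E9 U+05DC U+05D5
U+05DD) so that the result does not depend on how the source file itself is
saved.

```cpp
#include <cstddef>
#include <string>

// Hypothetical test literal: four Hebrew letters written as universal
// character names. The compiler converts them to its execution character set.
const char kLiteral[] = "\u05E9\u05DC\u05D5\u05DD";

// The same four letters spelled out as raw UTF-8 bytes. Hex escapes are not
// converted, so this array holds these bytes on any compiler.
const char kUtf8Shalom[] = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D";

// Render each byte of s as two lowercase hex digits, separated by spaces.
std::string hex_dump(const std::string& s) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}

// True when the compiler stored kLiteral as UTF-8, i.e. when the execution
// character set used for this translation unit was UTF-8.
bool literal_is_utf8() {
    return std::string(kLiteral) == kUtf8Shalom;
}
```

With GCC's default -fexec-charset the comparison is expected to hold. With a
compiler whose execution character set is a legacy code page, printing
hex_dump(kLiteral) would show different bytes, which is the behaviour
described above.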

I seriously doubt the author's ability to understand the real-world
situation.
This library is not only useless, but also harmful for localization.
It encourages people to use ASCII.

The reason there are so many ASCII-compatible encodings is, I think,
partly that they were a quick workaround.
Much existing code expected ASCII. Unicode was not a viable solution
at that time.
In order to handle their languages, people created encodings that were
compatible with ASCII.
It worked most of the time.

No matter how loudly you say "This library expects ASCII input, and it's
the programmer's responsibility to pass ASCII. Anything else deserves to
be broken",
people will use these ASCII-compatible encodings with existing code,
because it works most of the time.

They want to use their language.
They want to use an encoding which can express their language.
So they use ASCII-compatible encodings where ASCII is expected.

We have to get rid of ASCII.
What a shame: a localization library which expects ASCII input.

>
> Artyom

-- 
Ryou Ezoe

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk