Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-14 10:27:49
>
> -1
>
> I'm opposed to this strategy simply because it differs from the way
> existing libraries treat narrow strings. Not least the STL. If you open
> an fstream with a narrow filename, for instance, this isn't treated as a
> UTF-8 string. It's treated as being in the local codepage.
>
First of all, neither C++03 nor C++0x lets you open a file stream
with a wide file name. MSVC provides a non-standard extension, but
it does not exist in other compilers like GCC/MinGW.
So using C++ you can't open a file called "שלום-سلام-pease-Мир.txt"
under Microsoft Windows.
You can use an OS-level API like _wfopen to do this job with a wide
string. But you can't do this in C++. Period.
The idea is the following:
1. Provide replacements for the system library facilities that actually
use text, and treat that text as being in a known encoding.
For the STL and the standard C library that would be the filesystem API,
so you need to provide something like boost::filesystem::fstream.
2. Make all Boost libraries use the Wide API only and never call the ANSI API.
3. Treat narrow strings as UTF-8 and convert them to wide strings prior to system calls.
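The three points above can be sketched roughly as follows. This is only an illustration under stated assumptions: fopen_utf8 and widen_utf8 are hypothetical names, not Boost APIs. On Windows the sketch relies on the Win32 function MultiByteToWideChar and the MSVC/MinGW runtime function _wfopen; on other platforms it assumes the narrow file API already accepts UTF-8 bytes and simply passes the name through.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

// Hypothetical helper: UTF-8 -> UTF-16 using the Win32 conversion routine.
// Returns an empty string if the input is empty or is not valid UTF-8.
static std::wstring widen_utf8(const std::string& s)
{
    const int len = static_cast<int>(s.size());
    // The first call only measures the required number of wchar_t units.
    const int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                      s.data(), len, NULL, 0);
    if (n <= 0) return std::wstring();
    std::wstring w(static_cast<std::size_t>(n), L'\0');
    const int written = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                            s.data(), len, &w[0], n);
    assert(written == n);
    (void)written;
    return w;
}
#endif

// Hypothetical helper: open a file whose name is a UTF-8 narrow string.
// Returns a null pointer on failure, like std::fopen.
std::FILE* fopen_utf8(const std::string& name, const std::string& mode)
{
#ifdef _WIN32
    // Never call the ANSI API: widen first, then call the Wide API.
    const std::wstring wname = widen_utf8(name);
    const std::wstring wmode = widen_utf8(mode);
    if (wname.empty() || wmode.empty()) return NULL;
    return _wfopen(wname.c_str(), wmode.c_str());
#else
    // Assumption: the platform's narrow API takes UTF-8 bytes as they are.
    return std::fopen(name.c_str(), mode.c_str());
#endif
}
```

With a wrapper like this, calling code passes the same UTF-8 std::string on every platform, and the conversion stays hidden at the system-call boundary. A stream class in the spirit of boost::filesystem::fstream could be layered on the same idea.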
>
> While this behaviour isn't great, it is standard.
>
If the standard is bad and leads to unportable, platform-incompatible
code, it should not be used!
You can always provide a fallback like boost::utf8_to_locale_encoding
if you have to use the ANSI API.
But generally you should just use something like boost::utf8_to_utf16
and always call the Wide API.
You must not use the ANSI API under Windows.
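To make the "something like boost::utf8_to_utf16" step concrete, here is a minimal portable sketch of such a conversion. The name utf8_to_utf16_sketch is hypothetical and this is not an existing Boost function. It decodes each UTF-8 sequence by hand, rejects malformed input with std::invalid_argument (an assumption about error policy; a real library might substitute U+FFFD instead), and encodes code points above U+FFFF as surrogate pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical converter: strict UTF-8 -> UTF-16, throwing on bad input.
std::u16string utf8_to_utf16_sketch(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;        // code point being assembled
        std::size_t extra = 0;  // continuation bytes expected after the lead
        char32_t min_cp = 0;    // smallest value allowed for this length

        if (lead < 0x80)                { cp = lead;        extra = 0; min_cp = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; min_cp = 0x10000; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i - 1 < extra)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogate code points, and values past U+10FFFF.
        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("overlong or out-of-range UTF-8 sequence");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Split into a high and a low surrogate.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, where wchar_t holds UTF-16 code units, the result could be copied into a std::wstring and handed to a Wide API call; that last step is left out here to keep the sketch platform-neutral.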
Artyom