Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-14 10:27:49
> I'm opposed to this strategy simply because it differs from the way
> existing libraries treat narrow strings. Not least the STL. If you open
> an fstream with a narrow filename, for instance, this isn't treated as a
> UTF-8 string. It's treated as being in the local codepage.
First of all, neither in C++03 nor in C++0x can you
open a file stream with a wide file name. MSVC provides
a non-standard extension, but it does not exist in other compilers.
So using C++ you can't open a file called "שלום-سلام"
under Microsoft Windows.
You can use an OS-level API like _wfopen to do this job using a wide
string. But you can't do this in C++. Period.
The idea is the following:
1. Provide replacements for the system libraries that actually
use text and treat it as text in some encoding.
For the STL and the standard C library, that would be the filesystem API,
so you need to provide something like boost::filesystem::fstream.
2. Make all Boost libraries use the Wide API only and never call the ANSI API.
3. Treat narrow strings as UTF-8 and convert them to wide strings prior to system calls.
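Step 3 could be sketched roughly as below. This is only an illustration of the conversion itself, written portably so it does not depend on any OS call. The name utf8_to_utf16_sketch is made up for this example and is not an existing Boost function, and throwing on malformed input is just one possible policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Boost API): decode a UTF-8 narrow string and
// re-encode it as UTF-16. Throws std::invalid_argument on malformed input
// (bad lead byte, truncated or overlong sequence, surrogate, > U+10FFFF).
std::u16string utf8_to_utf16_sketch(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;      // code point being assembled
        std::size_t extra = 0;     // continuation bytes expected after the lead
        std::uint32_t min_cp = 0;  // smallest code point legal for this length

        if (lead < 0x80)                { cp = lead;        extra = 0; min_cp = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; min_cp = 0x10000; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i <= extra)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return out;
}
```

On Windows the resulting UTF-16 code units are what the Wide API expects, so a library would run this kind of conversion right at the boundary before the system call.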
> While this behaviour isn't great, it is standard.
If the standard is bad and leads to unportable, platform-incompatible
code, it should not be used!
You can always provide a fallback like boost::utf8_to_locale_encoding
if you have to use the ANSI API.
But generally you should just use something like boost::utf8_to_utf16
and always call the Wide API.
You must not use the ANSI API under Windows.
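To make the "always call the Wide API" rule concrete, here is a minimal sketch of a file-opening wrapper. The names fopen_utf8 and widen_utf8 are invented for this example and are not Boost functions. The Windows branch assumes the Win32 MultiByteToWideChar call and the CRT _wfopen function. On other platforms the sketch assumes the narrow name can be passed through unchanged.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

// Hypothetical helper: UTF-8 -> UTF-16 using the Win32 conversion routine.
// Returns an empty string if the input is empty or is not valid UTF-8.
static std::wstring widen_utf8(const std::string& s)
{
    if (s.empty()) return std::wstring();
    const int len = static_cast<int>(s.size());
    const int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                      s.data(), len, nullptr, 0);
    if (n <= 0) return std::wstring();
    std::wstring w(static_cast<std::size_t>(n), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        s.data(), len, &w[0], n);
    return w;
}
#endif

// Hypothetical wrapper: callers always pass UTF-8, on every platform.
// Returns nullptr on failure, like std::fopen.
std::FILE* fopen_utf8(const std::string& name, const std::string& mode)
{
#ifdef _WIN32
    // Never touch the ANSI API: convert and call the wide variant.
    const std::wstring wname = widen_utf8(name);
    const std::wstring wmode = widen_utf8(mode);
    if (wname.empty() || wmode.empty()) return nullptr;
    return _wfopen(wname.c_str(), wmode.c_str());
#else
    // Assumption: the narrow API takes the bytes as they are here.
    return std::fopen(name.c_str(), mode.c_str());
#endif
}
```

With a wrapper like this, user code stays identical on all platforms, and the local codepage never enters the picture on Windows.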