Subject: Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]
From: Sergey Cheban (s.cheban_at_[hidden])
Date: 2011-01-20 13:22:15
19.01.2011 18:34, Alexander Lamaison wrote:
> Even if I bought the UTF-8ed-Boost idea, what would we do about the STL
> implementation on Windows which expects local-codepage narrow strings? Are
> we hoping MS etc. change these to match? Because otherwise we'll be
> converting between narrow encodings for the rest of eternity.
The problems with MSVC and multilingual filenames are not Boost-related.
Even the following code doesn't work correctly:
#include <stdio.h>

int main(int argc, char *argv[])
{
    /* Echo the first command-line argument as-is. */
    if (argc > 1)
        printf("%s", argv[1]);
    return 0;
}
>1.exe asdfфыва
asdfЇ√тр
As you can see, the Cyrillic characters are broken (this is an ANSI vs
OEM code page issue and is not related to Unicode at all).
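To make the ANSI vs OEM mismatch concrete, here is a minimal sketch, assuming the ANSI code page is Windows-1251 and the console (OEM) code page is CP866, as on a typical Russian Windows setup. The function names are made up for illustration, and only ASCII and the basic Russian letters are handled. On Windows itself, CharToOemA() performs this kind of conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map one Windows-1251 (ANSI) byte to CP866 (OEM).
   Assumption: only ASCII and the letters U+0410..U+044F are supported;
   every other byte is replaced with '?'. */
unsigned char cp1251_to_cp866_byte(unsigned char c)
{
    if (c < 0x80)
        return c;                           /* ASCII is the same in both */
    if (c >= 0xC0 && c <= 0xEF)
        return (unsigned char)(c - 0x40);   /* 0xC0..0xEF -> 0x80..0xAF */
    if (c >= 0xF0)
        return (unsigned char)(c - 0x10);   /* 0xF0..0xFF -> 0xE0..0xEF */
    return '?';                             /* not covered by this sketch */
}

/* Hypothetical helper: convert a NUL-terminated string in place. */
void cp1251_to_cp866(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        *s = (char)cp1251_to_cp866_byte((unsigned char)*s);
}
```

Passing argv[1] through such a conversion before printf() should make the console show the letters that were typed, because the bytes then match the code page the console uses for display.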
Please note that the Cygwin compiler/libc has no such problem because
it uses UTF-8 (by default, at least). Its fopen() uses UTF-8 for
filenames, too.
So, we may choose one of the following:
1. Wait until MS fixes the problem on their side. For now, Windows
users may use short filenames (i.e. GetShortPathName()) for
multilingual filenames.
2. Provide a char * interface that allows Windows developers to
work with multilingual filenames.
3. Provide a WCHAR * interface specifically for Windows developers and
allow them to write non-portable code. Leave the char * interface
unusable on Windows/MSVC and wait until MS fixes it on their side.
4. Create an almost-portable wchar_t * interface.
5. Create our own type (boost::native_t or boost::utf8_t) and conversion
routines for it. Please note that independent libraries will NEVER use
foreign non-standard types.
I think only the 2nd and 3rd options are realistic.
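For illustration, option 2 could look roughly like the sketch below, assuming the char * interface takes UTF-8. The names utf8_to_utf16() and utf8_fopen() and the buffer sizes are hypothetical, not an existing Boost or CRT API. On Windows the path is converted to UTF-16 and passed to _wfopen(); elsewhere it goes straight to fopen(). The converter is deliberately simple and does not reject overlong forms or encoded surrogates; on Windows, MultiByteToWideChar() with CP_UTF8 could do the conversion instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: decode NUL-terminated UTF-8 into UTF-16 code units.
   Returns the number of units written (terminator excluded), or (size_t)-1
   on malformed input or when `cap` units are not enough. */
size_t utf8_to_utf16(const char *in, uint16_t *out, size_t cap)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (cap == 0)
        return (size_t)-1;
    while (*p != 0) {
        uint32_t cp;
        int extra;                          /* continuation bytes expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;             /* invalid lead byte */
        ++p;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return (size_t)-1;          /* truncated sequence */
            cp = (cp << 6) | (uint32_t)(*p++ & 0x3F);
        }
        if (cp >= 0x10000) {                /* needs a surrogate pair */
            if (cp > 0x10FFFF || n + 2 >= cap)
                return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= cap)
                return (size_t)-1;
            out[n++] = (uint16_t)cp;
        }
    }
    out[n] = 0;
    return n;
}

/* Hypothetical char * interface: the path is always UTF-8. */
FILE *utf8_fopen(const char *utf8_path, const char *mode)
{
#ifdef _WIN32
    uint16_t wpath[260], wmode[16];         /* assumed limits for the sketch */

    if (utf8_to_utf16(utf8_path, wpath, 260) == (size_t)-1 ||
        utf8_to_utf16(mode, wmode, 16) == (size_t)-1)
        return NULL;
    /* wchar_t is a 16-bit type on Windows, so the units map one to one. */
    return _wfopen((const wchar_t *)wpath, (const wchar_t *)wmode);
#else
    return fopen(utf8_path, mode);          /* assume a UTF-8 filesystem API */
#endif
}
```

With something like this, user code stays portable (it only ever sees char * and UTF-8), and the non-portable part is confined to one function.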
-- Best regards, Sergey Cheban