Subject: Re: [boost] [filesystem] path does not use global locale's codecvt facet - bug or feature
From: PB (newbarker_at_[hidden])
Date: 2011-03-18 07:09:57
On Sun, Mar 6, 2011 at 4:52 PM, Beman Dawes <bdawes_at_[hidden]> wrote:
> On Sun, Mar 6, 2011 at 6:32 AM, Yechezkel Mett <ymett.on.boost_at_[hidden]> wrote:
>> On Thu, Mar 3, 2011 at 8:29 PM, Beman Dawes <bdawes_at_[hidden]> wrote:
> ...
>>> "The default imbued locale provides a codecvt facet that invokes
>>> Windows MultiByteToWideChar or WideCharToMultiByte API's with a
>>> codepage of CP_THREAD_ACP if Windows AreFileApisANSI() is true,
>>> otherwise codepage CP_OEMCP. [Rationale: this is the current behavior
>>> of C and C++ programs that perform file operations using narrow
>>> character string to identify paths. Changing this in the Filesystem
>>> library would be too surprising, particularly where user input is
>>> involved. -- end rationale]"
>>
>> It should use CP_ACP not CP_THREAD_ACP, because that's what the
>> Windows API functions (CreateFile etc) and the C library functions
>> (fopen etc) use. The C++ library functions (fstream::open etc) do in
>> fact use the C global locale (by way of mbstowcs if I remember
>> correctly).
>>
>> (CP_THREAD_ACP should never be used for converting code pages - it's
>> based on the User Locale which is for sort orders and numeric
>> formats.)
>
> Hum... I marked your previous message
> http://lists.boost.org/Archives/boost/2010/11/173382.php for action,
> and then never did anything about it. Sorry, my mistake.
>
> It would be very confusing and error prone if std::fstream,
> boost::filesystem::fstream, and boost::filesystem operational
> functions treat a narrow string filename differently. Since
> std::fstream can't be changed, that implies whatever a given standard
> library does should be what boost filesystem does.
>
> That further implies that if library A does it one way, and library B
> does it a different way, boost filesystem should do it the way
> standard library version does it, even if that means a program using
> filesystem compiled with VC++ could behave differently than if
> compiled with some other compiler.
>
We've just run into this problem too. Our application is running under
Chinese Windows. The C locale is set using std::setlocale(LC_CTYPE,"")
to match the active code page of the operating system (Chinese), but
the thread locale is adjusted to English to select the English
resources, not the Chinese ones. A filename (e.g. one selected from a
file open dialog box) that includes Chinese characters will fail if a
boost::filesystem::path object is constructed from the const char*
ANSI path. If we just pass the const char* directly to std::ifstream
then it works OK.

We had to patch Boost.Filesystem locally to use CP_ACP rather than
CP_THREAD_ACP. Would love to see this officially changed.
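
For anyone who cannot patch their copy, here is a rough sketch of a
workaround: widen the narrow name ourselves and hand the wide string to
path. The helper name widen_like_file_apis is made up for illustration
and is not part of Boost.Filesystem. On Windows it follows the rule
quoted above but with CP_ACP in place of CP_THREAD_ACP; elsewhere it
falls back to mbstowcs and the C global locale, which is an assumption
on my part about what is reasonable there.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of Boost.Filesystem): widen a narrow
// file name the way the narrow-character file APIs would interpret it.
std::wstring widen_like_file_apis(const char* narrow)
{
#ifdef _WIN32
    // Same selection rule as the quoted documentation, but using CP_ACP
    // rather than CP_THREAD_ACP.
    const UINT codepage = AreFileApisANSI() ? CP_ACP : CP_OEMCP;

    // First call: ask for the required length (includes the terminator
    // because the source length is given as -1).
    const int len = MultiByteToWideChar(codepage, MB_ERR_INVALID_CHARS,
                                        narrow, -1, nullptr, 0);
    if (len <= 0)
        throw std::runtime_error("cannot convert narrow file name");

    std::vector<wchar_t> buf(static_cast<std::size_t>(len));
    if (MultiByteToWideChar(codepage, MB_ERR_INVALID_CHARS,
                            narrow, -1, buf.data(), len) <= 0)
        throw std::runtime_error("cannot convert narrow file name");

    return std::wstring(buf.data()); // stops at the terminator
#else
    // Assumption: outside Windows, use the C global locale via mbstowcs.
    // A null destination asks only for the length (POSIX behaviour).
    const std::size_t len = std::mbstowcs(nullptr, narrow, 0);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error("cannot convert narrow file name");

    std::vector<wchar_t> buf(len + 1);
    std::mbstowcs(buf.data(), narrow, len + 1);
    return std::wstring(buf.data(), len);
#endif
}
```

With that in place, something like
boost::filesystem::path p(widen_like_file_apis(name)) should bypass
path's own narrow-to-wide conversion, since path can be constructed
from a wide string. We have not tried this ourselves; the local patch
is what we actually run.
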
Regards,
Pete