Subject: Re: [boost] [filesystem] path does not use global locale's codecvt facet - bug or feature
From: Yechezkel Mett (ymett.on.boost_at_[hidden])
Date: 2011-03-06 06:32:23


On Thu, Mar 3, 2011 at 8:29 PM, Beman Dawes <bdawes_at_[hidden]> wrote:
> On Thu, Mar 3, 2011 at 8:31 AM, Artyom <artyomtnk_at_[hidden]> wrote:
>> Hello,
>>
>> Boost.Filesystem v3 uses wide paths under Windows and can convert them from
>> narrow ones using a codecvt facet, so I would expect that if the global
>> locale has a special codecvt facet installed, Boost.Filesystem would use it,
>> e.g.:
>>
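>> #include <boost/filesystem.hpp>
>> #include <boost/filesystem/fstream.hpp>
>> #include <boost/locale.hpp>
>> #include <locale>
>>
>> int main()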
>> {
>>   boost::locale::generator locale_generator;
>>   std::locale::global(locale_generator("en_US.UTF-8"));
>>   // Now default codecvt facet is UTF-8 one.
>>   boost::filesystem::path p("שלום.txt");
>>   boost::filesystem::ofstream test(p);
>> }
>>
>> However this does not work as expected!
>>
>> I found that you need to imbue the locale explicitly:
>>
>>
>>   boost::filesystem::path p;
>>   p.imbue(std::locale()); // global one
>>   p = "שלום.txt";
>>   boost::filesystem::ofstream test(p);
>>
>> Now it works.
>>
>> Should I open a ticket for this, or is this "planned" behavior?
>
> That depends. The docs recently (Feb 20, rev 69073) got updated to
> provide more detail. For Windows, including Cygwin and MinGW, this is
> part of what the docs say:
>
> "The default imbued locale provides a codecvt facet that invokes
> Windows MultiByteToWideChar or WideCharToMultiByte API's with a
> codepage of CP_THREAD_ACP if Windows AreFileApisANSI() is true,
> otherwise codepage CP_OEMCP. [Rationale: this is the current behavior
> of C and C++ programs that perform file operations using narrow
> character string to identify paths. Changing this in the Filesystem
> library would be too surprising, particularly where user input is
> involved. -- end rationale]"

It should use CP_ACP, not CP_THREAD_ACP, because that's what the
Windows API functions (CreateFile etc.) and the C library functions
(fopen etc.) use. The C++ library functions (fstream::open etc.) do in
fact use the C global locale (by way of mbstowcs, if I remember
correctly).
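
To illustrate what "by way of mbstowcs" means in practice, here is a
minimal sketch of widening a narrow file name under the current C
locale (LC_CTYPE). The helper name widen_with_c_locale is hypothetical
and is not part of any library; whether a given C++ runtime actually
takes this route inside fstream::open is not confirmed here.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: widen a narrow string using the current C locale,
// the way a runtime built on mbstowcs would.
std::wstring widen_with_c_locale(const std::string& narrow)
{
    // A multibyte string never yields more wide characters than it has
    // bytes, so size() + 1 elements (for the terminator) are enough.
    std::vector<wchar_t> buf(narrow.size() + 1);

    const std::size_t n = std::mbstowcs(&buf[0], narrow.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();  // invalid sequence for the current locale

    return std::wstring(&buf[0], n);
}
```

The result depends entirely on what setlocale(LC_CTYPE, ...) was last
set to, which is why the C global locale matters for this path.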

(CP_THREAD_ACP should never be used for converting between code pages;
it's based on the User Locale, which is for sort orders and numeric
formats.)
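
As a rough sketch of the selection being argued for (CP_ACP when
AreFileApisANSI() is true, CP_OEMCP otherwise), something like the
following could be used. The helper names path_codepage and widen_path
are hypothetical, and this is not Boost.Filesystem's actual
implementation. The stand-in constants exist only so the sketch
compiles off Windows.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the sketch compiles off Windows; values match <winnls.h>.
const unsigned int CP_ACP = 0;
const unsigned int CP_OEMCP = 1;
#endif

// Hypothetical helper: pick the code page the narrow file APIs use.
inline unsigned int path_codepage(bool file_apis_ansi)
{
    return file_apis_ansi ? CP_ACP : CP_OEMCP;
}

#ifdef _WIN32
// Hypothetical helper: widen a narrow path the way CreateFileA would
// interpret it. Returns an empty string on failure.
inline std::wstring widen_path(const std::string& narrow)
{
    if (narrow.empty())
        return std::wstring();

    const UINT cp = path_codepage(AreFileApisANSI() != FALSE);
    const int in_len = static_cast<int>(narrow.size());

    // First call: ask for the required length in wide characters.
    const int out_len =
        MultiByteToWideChar(cp, 0, narrow.data(), in_len, NULL, 0);
    if (out_len <= 0)
        return std::wstring();

    // Second call: perform the conversion into the sized buffer.
    std::wstring wide(static_cast<std::size_t>(out_len), L'\0');
    MultiByteToWideChar(cp, 0, narrow.data(), in_len, &wide[0], out_len);
    return wide;
}
#endif
```

Note that CP_THREAD_ACP does not appear anywhere in this selection.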

Yechezkel Mett

