Subject: Re: [boost] [filesystem] path does not use global locale's codecvt facet - bug or feature
From: Yechezkel Mett (ymett.on.boost_at_[hidden])
Date: 2011-03-22 08:23:25


On Sun, Mar 6, 2011 at 6:52 PM, Beman Dawes <bdawes_at_[hidden]> wrote:
> On Sun, Mar 6, 2011 at 6:32 AM, Yechezkel Mett <ymett.on.boost_at_[hidden]> wrote:
>> On Thu, Mar 3, 2011 at 8:29 PM, Beman Dawes <bdawes_at_[hidden]> wrote:
> ...
>>> "The default imbued locale provides a codecvt facet that invokes
>>> Windows MultiByteToWideChar or WideCharToMultiByte APIs with a
>>> codepage of CP_THREAD_ACP if Windows AreFileApisANSI() is true,
>>> otherwise codepage CP_OEMCP. [Rationale: this is the current behavior
>>> of C and C++ programs that perform file operations using narrow
>>> character strings to identify paths. Changing this in the Filesystem
>>> library would be too surprising, particularly where user input is
>>> involved. -- end rationale]"
>>
>> It should use CP_ACP not CP_THREAD_ACP, because that's what the
>> Windows API functions (CreateFile etc) and the C library functions
>> (fopen etc) use. The C++ library functions (fstream::open etc) do in
>> fact use the C global locale (by way of mbstowcs if I remember
>> correctly).
>>
>> (CP_THREAD_ACP should never be used for converting code pages - it's
>> based on the User Locale which is for sort orders and numeric
>> formats.)
...
> It would be very confusing and error prone if std::fstream,
> boost::filesystem::fstream, and boost::filesystem operational
> functions treat a narrow string filename differently. Since
> std::fstream can't be changed, that implies whatever a given standard
> library does should be what boost filesystem does.
>
> That further implies that if library A does it one way, and library B
> does it a different way, boost filesystem should do it the way the
> standard library in use does it, even if that means a program using
> filesystem compiled with VC++ could behave differently than if
> compiled with some other compiler.
>
> Does that make sense?

It does, though one could argue that boost should always do it one
way, even across multiple implementations that work differently, for
portability reasons. (I consider consistency and portability to be
reasons to use boost over the implementation-supplied library.)

Note that consistency with the Windows API is more likely what the
user would expect, unless he's been bitten by the std::fstream
behaviour already.

I would recommend providing the correct incantation to set the global
locale to the Windows ANSI codepage as a note in the documentation -
it's not obvious:

std::locale::global(std::locale(str(boost::format(".%||") %
GetACP()).c_str(), LC_CTYPE));
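
For readers who want the same effect without boost::format, here is a minimal sketch using only the standard three-argument std::locale constructor. The helper names codepage_locale_name, set_global_ctype and use_ansi_codepage_for_narrow_paths are hypothetical (they are not part of Boost or the standard library), and the ".<codepage>" locale-name syntax is assumed to be accepted only by the Visual C++ runtime, so the GetACP() call is guarded by _WIN32.

```cpp
#include <locale>
#include <string>

#ifdef _WIN32
#include <windows.h>  // GetACP
#endif

// Hypothetical helper: builds a ".<codepage>" locale name, e.g. ".1252".
std::string codepage_locale_name(unsigned int codepage)
{
    return "." + std::to_string(codepage);
}

// Hypothetical helper: copies the current global locale, replaces only its
// ctype category (which includes the codecvt facets) with the named locale,
// and installs the result as the new global locale.
// Returns the previous global locale so the caller can restore it.
// Throws std::runtime_error if the runtime does not recognise the name.
std::locale set_global_ctype(const std::string& name)
{
    std::locale loc(std::locale(), name.c_str(), std::locale::ctype);
    return std::locale::global(loc);
}

#ifdef _WIN32
// Assumption: the runtime accepts ".<codepage>" names, as Visual C++ does.
void use_ansi_codepage_for_narrow_paths()
{
    set_global_ctype(codepage_locale_name(GetACP()));
}
#endif
```

On Windows one would call use_ansi_codepage_for_narrow_paths() once, early in main(), before any narrow path is converted. Keeping the previous locale returned by set_global_ctype makes it possible to restore it afterwards.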

> So a test case is needed that will distinguish between the C++
> standard library fstream using CP_ACP, C global locale, or something
> totally different.
>
> Do you already have such test code or could you put something together?

I don't have test code (I traced through the library in the debugger
to work out what it was doing), and for licensing reasons I don't
think I could provide such code if I did create it.

I would imagine the way to test it would be to create files with known
names and content in varying codepages (using CreateFileW or by
supplying files with the test), and then to attempt to open the files
with fstream whilst varying the locale.
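
A rough sketch of the building blocks such a test might use is below. It is an illustration under stated assumptions, not code taken from this thread: the function names opens_under_locale, create_by_wide_name and narrow_in_codepage are hypothetical, and the Windows-only parts are guarded by _WIN32.

```cpp
#include <fstream>
#include <locale>
#include <string>

#ifdef _WIN32
#include <windows.h>  // CreateFileW, CloseHandle, WideCharToMultiByte
#endif

// Hypothetical probe: reports whether std::ifstream can open narrow_name
// while loc is the global locale. The previous global locale is restored
// before returning.
bool opens_under_locale(const std::string& narrow_name, const std::locale& loc)
{
    std::locale previous = std::locale::global(loc);
    bool ok;
    {
        std::ifstream in(narrow_name.c_str());
        ok = in.is_open();
    }  // stream is closed here, before the locale is restored
    std::locale::global(previous);
    return ok;
}

#ifdef _WIN32
// Creates an empty file by its wide name, so no narrow-to-wide conversion
// is involved in producing the file under test.
bool create_by_wide_name(const std::wstring& name)
{
    HANDLE h = CreateFileW(name.c_str(), GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return false;
    CloseHandle(h);
    return true;
}

// Encodes a wide name as bytes in the given code page (e.g. CP_ACP or
// CP_OEMCP). Returns an empty string if the conversion fails.
std::string narrow_in_codepage(const std::wstring& name, unsigned int codepage)
{
    int len = static_cast<int>(name.size());
    int n = WideCharToMultiByte(codepage, 0, name.c_str(), len,
                                NULL, 0, NULL, NULL);
    if (n <= 0)
        return std::string();
    std::string out(static_cast<std::string::size_type>(n), '\0');
    WideCharToMultiByte(codepage, 0, name.c_str(), len, &out[0], n, NULL, NULL);
    return out;
}
#endif
```

A test could then create a file with a non-ASCII wide name, encode that name in each candidate code page, and record which combinations of encoded name and global locale opens_under_locale accepts. The pattern of successes would indicate which conversion the standard library's fstream applies.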

Yechezkel Mett

