Subject: Re: [boost] [general] What will string handling in C++ look like in the future
From: Peter Dimov (pdimov_at_[hidden])
Date: 2011-01-21 11:11:05

Beman Dawes wrote:
> std::codecvt<wchar_t, char, mbstate_t> is the type, but for windows the
> actual object used is a custom codecvt that uses Windows
> MultiByteToWideChar() for the ANSI or OEM codepage, as determined by
> AreFileApisANSI().

This is not what

says. I can't find this custom codecvt in the v3 source either, but I
haven't looked very hard.

> But your point is correct, but only if you believe defaulting to the
> platform's usual open/fopen() behavior is the wrong thing.

It's not really a matter of belief. Using an "ANSI" path on Windows is
objectively a wrong thing, unless you are forced to do so by a library.
"ANSI" paths can't represent Windows paths properly.
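
To illustrate the point above, here is a minimal portable sketch, not real
Windows code. Windows stores file names as UTF-16, and an "ANSI" code page
covers only a small subset of that. The helper to_narrow_lossy is
hypothetical: it stands in for a wide-to-ANSI conversion and assumes a
single-byte, ISO-8859-1-like code page with '?' as the replacement
character. A real program would call WideCharToMultiByte instead.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration): converts a UTF-16
// file name to a single-byte code page. The code page is assumed to be
// ISO-8859-1-like, so only code units up to 0xFF are representable;
// everything else becomes '?'.
std::string to_narrow_lossy(const std::u16string& name) {
    std::string out;
    for (char16_t u : name) {
        out.push_back(u <= 0xFF ? static_cast<char>(u) : '?');
    }
    return out;
}
```

Under these assumptions, two distinct names such as u"\u65E5.txt" and
u"\u672C.txt" both collapse to "?.txt", so the narrow string no longer
identifies either file.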

Now, it is not objectively wrong for a library to default to ANSI when
given a narrow string, because this is what 90% of programmers would
expect. In fact, if a function's documentation doesn't state how it
interprets a narrow path string, I would assume ANSI as well; that is
simply how things are. Fine. But this design decision makes v3::path
unsuitable for people who don't want their strings to be silently treated
as ANSI paths (via the implicit conversion), because this hides logic
errors.

My current preference is for a Windows path class to provide path::from_ansi
and path::from_utf8; the implicit constructor, if present, would default to
UTF-8, although omitting it would be much less controversial, and I don't
really insist on having it.
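
A rough sketch of what I mean, under stated assumptions. The class name
win_path and its members are hypothetical and are not part of
boost::filesystem. The "ANSI" conversion is replaced by a portable stand-in
that maps each byte to the code point of the same value (ISO-8859-1); a
real Windows build would call MultiByteToWideChar with CP_ACP. The UTF-8
decoder is minimal and does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical sketch; this is not boost::filesystem::path.
class win_path {
public:
    // Named constructors: the caller states the encoding at every call site.
    static win_path from_utf8(const std::string& s) {
        return win_path(decode_utf8(s));
    }

    // Stand-in for "ANSI" (assumption): bytes map one-to-one to code points.
    static win_path from_ansi(const std::string& s) {
        std::u16string out;
        for (unsigned char c : s) out.push_back(static_cast<char16_t>(c));
        return win_path(std::move(out));
    }

    const std::u16string& native() const { return native_; }

private:
    // Private and explicit: no implicit conversion from any string type.
    explicit win_path(std::u16string s) : native_(std::move(s)) {}

    // Minimal UTF-8 to UTF-16 decoder; throws on malformed input.
    static std::u16string decode_utf8(const std::string& s) {
        std::u16string out;
        std::size_t i = 0;
        while (i < s.size()) {
            const unsigned char b = static_cast<unsigned char>(s[i]);
            char32_t cp;
            std::size_t extra;
            if (b < 0x80)                { cp = b;        extra = 0; }
            else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
            else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
            else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
            else throw std::invalid_argument("invalid UTF-8 lead byte");
            if (extra > s.size() - i - 1)
                throw std::invalid_argument("truncated UTF-8 sequence");
            for (std::size_t k = 1; k <= extra; ++k) {
                const unsigned char c = static_cast<unsigned char>(s[i + k]);
                if ((c & 0xC0) != 0x80)
                    throw std::invalid_argument("invalid continuation byte");
                cp = (cp << 6) | (c & 0x3F);
            }
            i += extra + 1;
            if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
                throw std::invalid_argument("invalid code point");
            if (cp < 0x10000) {
                out.push_back(static_cast<char16_t>(cp));
            } else {  // encode as a surrogate pair
                cp -= 0x10000;
                out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            }
        }
        return out;
    }

    std::u16string native_;
};

// A narrow string never turns into a path silently.
static_assert(!std::is_convertible<std::string, win_path>::value,
              "win_path must not be implicitly constructible from a string");

// The same two bytes yield different paths depending on the stated encoding.
inline void example() {
    const std::string bytes = "\xC3\xA9";  // UTF-8 encoding of U+00E9
    assert(win_path::from_utf8(bytes).native() == u"\u00E9");
    assert(win_path::from_ansi(bytes).native() == u"\u00C3\u00A9");
}
```

With no implicit constructor, passing a std::string where a path is
expected fails to compile, so the choice of encoding cannot be made by
accident.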

> What I'm suggesting is that people who want to use Unicode use wchar_t
> strings now, and char16_t or char32_t strings in C++0x.

This is worse than using UTF-8 on Windows from a portability standpoint, as
I've explained in my previous post.
