From: Dave Harris (brangdon_at_[hidden])
Date: 2004-11-16 14:57:05

In-Reply-To: <[hidden]>
bdawes_at_[hidden] (Beman Dawes) wrote (abridged):
> Your strongest argument IMO is the point about conversions not
> necessarily being value preserving.

I believe conversions from char to wchar_t are always value preserving, or
can be made to be.
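
To illustrate the widening direction, here is a minimal sketch. It assumes a
Latin-1-like code page in which each byte maps to the code point of the same
value; a real code page would need a lookup table or the OS call instead. The
name widen_latin1 is hypothetical and is not part of any proposed interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Boost or Win32 function).
// Assumption: Latin-1-like code page, so byte value == code point.
std::wstring widen_latin1(const std::string& narrow)
{
    std::wstring wide;
    wide.reserve(narrow.size());
    for (char c : narrow) {
        // Convert via unsigned char so bytes >= 0x80 are not sign-extended.
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return wide;
}

// Small usage check: every input byte has a distinct wide counterpart.
void widen_demo()
{
    assert(widen_latin1("abc") == L"abc");
    assert(widen_latin1(std::string(1, '\xE9'))[0] == static_cast<wchar_t>(0xE9));
}
```

Under that assumption no information is lost, since every char value has its
own wchar_t value.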

I've known MultiByteToWideChar() to fail with CP_SYMBOL on Win98, but it's
easy to handle that as a special case (and replicate the XP behaviour). I
would be interested to hear of any other counter-examples.

> (I guess we could tell Windows users that they should not expect
> such conversions to work unless supported by the applicable
> codepage. But that seems spin rather than a real solution.)

I don't think there's any magic here. WideCharToMultiByte() is usually a
misnomer, as most code pages only support 256 characters. Even the
double-byte ones cannot support the whole of 16-bit Unicode, let alone
surrogate pairs. Typically converting an unsupported character will yield
a question mark, which is not valid in a file name.
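
As a rough sketch of handling the narrowing direction explicitly, the code
below reports failure when a character has no narrow equivalent, rather than
silently substituting a question mark. It again assumes a Latin-1-like code
page (code points 0 to 255 only); narrow_latin1 is a hypothetical name, not
an existing API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Boost or Win32 function).
// Assumption: the target code page is Latin-1-like, so only code points
// 0..255 are representable. Returns false, leaving 'out' untouched, if any
// character cannot be represented.
bool narrow_latin1(const std::wstring& wide, std::string& out)
{
    std::string result;
    result.reserve(wide.size());
    for (wchar_t wc : wide) {
        const unsigned long code = static_cast<unsigned long>(wc);
        if (code > 0xFF) {
            return false;  // unsupported character: fail, do not emit '?'
        }
        result.push_back(static_cast<char>(static_cast<unsigned char>(code)));
    }
    out = result;
    return true;
}

// Small usage check: a plain ASCII name converts, a CJK character does not.
void narrow_demo()
{
    std::string out;
    assert(narrow_latin1(L"file.txt", out) && out == "file.txt");
    assert(!narrow_latin1(std::wstring(1, static_cast<wchar_t>(0x4E2D)), out));
    assert(out == "file.txt");  // unchanged after the failed conversion
}
```

On Windows itself, the lpUsedDefaultChar argument of WideCharToMultiByte()
reports whether a default character was substituted, so a similar check is
possible there.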

So if we do the conversion, the unsupported characters will fail in our
code, and if we don't, they will fail in the OS. There's no magic to make it
work. When "OS" means "Win98+MSLU", in my experience it is better to
handle the conversion explicitly as MSLU doesn't always do what I'd want.
I have sometimes found the best way to convert a Unicode path to ANSI is
via GetShortPathName().

This isn't an argument against using two path classes. It is an argument
against wpath relying on MSLU.

In my Unicode apps an important use case was passing Unicode filenames to
other, ANSI apps and libraries. So if we do have two path classes we will
still need to offer conversions between them. Probably this conversion
should match the one you get when making OS calls, which is another
argument for doing the conversion ourselves.

Having 2 classes is probably the clearest way to manage such cases.

-- Dave Harris, Nottingham, UK