Boost Users :

Subject: Re: [Boost-users] boost filesystem
From: Christopher (cpisz_at_[hidden])
Date: 2012-01-25 19:55:25


John Emmas <johne53 <at> tiscali.co.uk> writes:

>
>
> On 24 Jan 2012, at 20:00, Christopher wrote:
>
> > John Emmas <johne53 <at> tiscali.co.uk> writes:
> >
> >> I've just discovered to my dismay that Microsoft's implementation of
> > fstream, ifstream and ofstream are fatally flawed.
> >
> > How so?
> >
> > You fail to explain how it does anything other than what the standard tells
> > you it does.
> >
> > Please give a code example of this flaw.
> >
>
> I discovered this afternoon that it's already been acknowledged by Microsoft
> as a bug. It affects VC8 and 9 but is fixed in VC10.
>
> http://connect.microsoft.com/VisualStudio/feedback/details/361133/a-call-to-the-std-filebuf-open-method-with-a-multibyte-path-that-worked-in-vc7-fails-in-vc9
>
> Since I'm using VC8 I'll be trying some workarounds tomorrow. I did try the
> fstream implementation from boost::filesystem (v1.40, I think) but it seemed
> to have the same problem. Thanks for your reply.

I still don't understand the problem, even after reading the MS Connect
write-up. A full code example is not presented. std::fstream takes its file
name as a const char *. Standard C++ streams are documented to use _ANSI C
strings_ for their arguments. If you pass any Unicode character that falls
outside the range shared with the ANSI character set, then you are passing an
invalid parameter.
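
To make the "shared range" point concrete, here is a minimal sketch of a check
for whether a narrow path stays inside 7-bit ASCII. The helper name
is_ascii_only is my own invention, not something from the standard library or
Boost, and the sketch assumes the usual ASCII-compatible code pages.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of any library: returns true when every byte
// of a narrow path is 7-bit ASCII, which is the range that ASCII-compatible
// ANSI code pages have in common.
bool is_ascii_only(const char* path)
{
    assert(path != nullptr);
    for (const char* p = path; *p != '\0'; ++p) {
        // Cast to unsigned char so bytes >= 0x80 are not seen as negative.
        if (static_cast<unsigned char>(*p) > 0x7F) {
            return false;
        }
    }
    return true;
}
```

A path that fails this check is one whose meaning depends on the code page in
effect, which is where I would expect trouble with a const char * file name.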

If you use wfstream, you may then pass in a wide string as an argument,
containing UTF-16LE characters that share the same range as the ASCII set.

At the time of the last standard, C++ did not consider Unicode at all.
The new C++11 standard supposedly addresses it, with better locales and facets.

You will also find that, regardless of whether the project settings are
Unicode or not, whatever text you pass into a stream and out to a file is going
to be transformed to the character set your machine's environment is set to
use. In my case, I might pass a UTF-16LE encoded wstring to wfstream and open
the file with a hex editor to find it has been transformed to Windows-1252.
That is also documented, but a pain in the arse.
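
A small sketch of that transformation, using only ASCII text so the result
does not depend on the machine. The helper names write_wide and slurp_bytes
are made up for this example, and it assumes the stream is left with the
default "C" locale. What happens to non-ASCII characters depends on the
platform and locale, so I make no claim about that here.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes wide text through std::wofstream, using
// whatever locale the stream has by default (no imbue call is made).
bool write_wide(const char* path, const std::wstring& text)
{
    assert(path != nullptr);
    std::wofstream out(path);
    if (!out) return false;
    out << text;
    out.flush();
    return out.good();
}

// Hypothetical helper: reads the file back as raw bytes for inspection.
std::string slurp_bytes(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Writing L"AB" this way and reading the bytes back should give one byte per
character rather than two (or four), which is the conversion to a narrow
encoding described above.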

So I conceded that, in order to actually stream Unicode to a file, I'd have to
read and write it as bytes and insert the BOM myself. It would seem that, for
the time being, "There is no standard C++ or Microsoft way to stream Unicode
to file" (without some nuances).
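
For what it's worth, the "bytes plus BOM" approach can be sketched roughly as
follows. This is not the code I actually used; the function names are invented
for the example, and it leans on std::u16string from the new standard, so it
assumes a compiler with C++11 support.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes UTF-16 code units to `path` as little-endian
// bytes, preceded by the UTF-16LE BOM (FF FE). The stream is opened in binary
// mode so no character set conversion or newline translation is applied.
bool write_utf16le_with_bom(const char* path, const std::u16string& text)
{
    assert(path != nullptr);
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;

    const char bom[2] = { '\xFF', '\xFE' };
    out.write(bom, 2);

    for (char16_t unit : text) {
        // Low byte first, then high byte: that is what "LE" means here.
        const char bytes[2] = {
            static_cast<char>(unit & 0xFF),
            static_cast<char>((unit >> 8) & 0xFF)
        };
        out.write(bytes, 2);
    }
    return out.good();
}

// Hypothetical helper: reads the file back as raw bytes for inspection.
std::string read_raw_bytes(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Note that the file name is still a plain const char *, so this only deals with
the contents of the file, not with Unicode in the path itself.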

This is all what I gleaned from my debugging woes with Unicode. I could be
wrong.


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net