Subject: Re: [boost] [nowide] Request for interest (nowide unicode support for windows)
From: Artyom (artyomtnk_at_[hidden])
Date: 2010-06-17 02:50:39


> > There are more. What about filesystem::remove and others?
> > From what I see in the code, it supports only path and not wpath
>
> Really? I doubt that. In FSv2 it takes a template path:

I was talking about v3.

> This delegates to RemoveFileA if passed a path and RemoveFileW if passed a
> wpath. glibc++/MinGW presumably uses the posix_remove API so this does,
> again, suffer from the problem. We could work around it in boost though I
> can't help but feel this is a MinGW problem: if it wants to work the
> Windows way it should provide wide APIs as well, if it wants to pretend
> it's POSIX it should interpret narrow strings as UTF-8.

It is not about "pretending to work on POSIX".

GCC's libstdc++ sits on top of the C runtime library (CRT): if you call
the standard remove() it ends up in DeleteFileA, and if you call _wremove()
it ends up in DeleteFileW.

You can call _wremove from MinGW as well, since it is a CRT function.

This has absolutely nothing to do with POSIX.

> We could potentially fix this in Filesystem v3 if it interpreted incoming
> narrow strings as UTF-8. Then you could create a 'path' using whichever
> type of string you like and the boost::filesystem functions would 'just
> work' (ok, issues with MinGW but nothing we can't work around by
> incorporating your code).

This would be a very good solution.

> It's gone in v3.

Very good.

> > So... Just create an API that is friendly to UTF-8 strings and
> > forget about this hell.
>
> +1 from me with one modification: don't prevent using wide path on Windows.
> Often you will need to pass a wide path that you get from somewhere else
> and it would be a pain if we had to convert these to UTF-8 manually.

Agreed. If Windows users want to use wide paths, let them, but such code
would be Windows-only.

> Why? Boost.Filesystem v3 almost does all of this already. It would need
> two changes to make it work exactly as you want:
>
> - Interpret narrow strings as UTF-8 by default on Windows (the user
> could always imbue it with the local code page facet if they really
> wanted to interact with the 'A' versions of Windows APIs).
>

This is not a solution:

Windows has never supported UTF-8 as the code page for the narrow ('A')
APIs, it does not support it now, and, judging by the links Lars Viklund
posted, it seems it never will. See this quote:

>
> Judging by assorted postings by Michael Kaplan (Unicode Grandmaster at
> Microsoft), there seems to be much fun to be derived from trying to use
> the UTF-8 codepage with narrow APIs.
>
> [1] http://blogs.msdn.com/b/michkap/archive/2006/07/14/665714.aspx
> [2] http://blogs.msdn.com/b/michkap/archive/2006/10/11/816996.aspx
> [3] http://blogs.msdn.com/b/michkap/archive/2006/03/13/550191.aspx
> [4] http://blogs.msdn.com/b/michkap/archive/2007/05/11/2547703.aspx
>
> Lars Viklund

So the only way to do this right is to **always** use the wide API on
Windows and convert narrow strings to wide ones just before calling the
appropriate API functions.
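
A minimal sketch of that approach for a single call, removing a file. The
helper names utf8_to_wide and remove_utf8 are made up for illustration and
are not part of Boost or of the proposed library; MultiByteToWideChar and
_wremove are the real Win32 and CRT functions. On non-Windows systems the
sketch assumes the narrow name can be passed through unchanged.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

// Hypothetical helper: convert UTF-8 to UTF-16 for the wide ('W') APIs.
// Returns an empty string if the conversion fails.
inline std::wstring utf8_to_wide(const std::string& utf8)
{
    if (utf8.empty())
        return std::wstring();
    const int in_len = static_cast<int>(utf8.size());
    // First call: ask how many wchar_t units the result needs.
    const int out_len =
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(), in_len, NULL, 0);
    if (out_len <= 0)
        return std::wstring();
    std::wstring wide(static_cast<std::size_t>(out_len), L'\0');
    // Second call: perform the conversion into the buffer.
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), in_len, &wide[0], out_len);
    return wide;
}
#endif

// Hypothetical helper: remove a file whose name is given as UTF-8.
// Returns 0 on success, like std::remove.
inline int remove_utf8(const std::string& utf8_name)
{
#ifdef _WIN32
    // Convert at the last moment, then call the wide CRT function.
    return _wremove(utf8_to_wide(utf8_name).c_str());
#else
    // Elsewhere the narrow bytes are handed to the system as-is.
    return std::remove(utf8_name.c_str());
#endif
}
```

The point of the design is that callers only ever see narrow UTF-8
strings; the wide string exists only for the duration of the call.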

> - Work around the MinGW 'bug' by incorporating some of your code.
>

I just want to be clear: this is not a bug (I know you put it in quotes).
This is what the C++ standard says: std::basic_filebuf **does not** have
an open() member function that takes wide strings.
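
To illustrate, a small sketch using std::filebuf, the standard stream
buffer that provides open(). The function name write_line is made up for
this example. It relies only on the const char* overload of open(), which
is the only file-name type the current standard (C++03) requires; any
overload taking a wide string is a vendor extension.

```cpp
#include <fstream>
#include <ostream>
#include <string>

// Hypothetical helper: write one line to a file through std::filebuf,
// using the const char* overload of open().
inline bool write_line(const char* narrow_name, const std::string& text)
{
    std::filebuf buf;
    if (!buf.open(narrow_name, std::ios_base::out | std::ios_base::trunc))
        return false;              // open() returns a null pointer on failure

    std::ostream out(&buf);
    out << text << '\n';
    out.flush();
    const bool written = out.good();

    // close() also returns a null pointer on failure.
    return buf.close() != 0 && written;
}
```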

Artyom

      

