Subject: Re: [boost] [Filesystem] v3 path separator changes
From: Yakov Galka (ybungalobill_at_[hidden])
Date: 2013-03-25 15:11:42


On Sun, Mar 24, 2013 at 11:51 PM, Rob Stewart <robertstewart_at_[hidden]> wrote:
[...]

> On Mar 24, 2013, at 2:47 PM, Yakov Galka <ybungalobill_at_[hidden]> wrote:
> > In other words: boost::path is a very dumb strong typedef for a string
> > that magically does encoding conversions and has some syntactic operations
> > defined, like operator / that adds a *slash*. (...or a *backslash* on
> > other platforms...)
>
> Yes, path is a glorified string class. A fair bit of its value will be
> lost when there's better Unicode string support in the standard library.
> That said, there's still value in an abstraction that permits assembling
> and decomposing paths.
>

Yes, but the current path class does not abstract anything, as was pointed
out in the original post.

[...]
> > What annoys me is that Boost.Filesystem has a fairly good multiplatform
> > implementation of filesystem operational functions, but one which depends
> > on this dumb path class.
>
> It would be reasonable to support overloads accepting strings, and not
> just paths. The current rationale, I think, is to overcome the lack of any
> other Unicode support.
>

There is a superior approach to Unicode support: using UTF-8 in narrow
chars.[1] Why superior? Because the types of path[n], path.c_str(),
path.string() etc. would not change from one system to another. Portable
libraries hide the differences between platforms rather than propagating
them to their interfaces.
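
To make the difference concrete, here is a minimal sketch. Both types are
hypothetical illustrations, not Boost.Filesystem's actual declarations: the
first mimics an interface whose character type follows the platform, the
second fixes it to char and assumes the contents are UTF-8.

```cpp
#include <string>
#include <type_traits>

// Hypothetical illustration only: neither type is part of Boost.Filesystem.

// Platform-dependent flavour: the character type leaks into the interface,
// so code that touches c_str() or string() differs between systems.
struct native_path {
#ifdef _WIN32
    typedef wchar_t value_type;   // UTF-16 code units on Windows
#else
    typedef char value_type;      // bytes elsewhere
#endif
    typedef std::basic_string<value_type> string_type;
    string_type s;
    const value_type* c_str() const { return s.c_str(); }
};

// UTF-8 flavour: the same types everywhere. Any conversion to the native
// encoding would happen only inside the functions that call the OS.
struct utf8_path {
    typedef char value_type;
    typedef std::string string_type;
    string_type s;
    const value_type* c_str() const { return s.c_str(); }
};

// Holds on every platform, which is the whole point.
static_assert(std::is_same<utf8_path::value_type, char>::value,
              "utf8_path always exposes char");
```

With the second type, user code that calls c_str() compiles unchanged on
every platform; with the first, it needs an #ifdef or a template.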

I admit that not everyone will agree on using narrow strings. But then, why
force your approach on those who don't like UTF-16 on Windows?
Boost.Filesystem v2 was much better in this respect too: if you like UTF-16
paths, use wpaths; if I like UTF-8 paths, I use (narrow) paths with an
appropriate locale embedded. And no, the current library interface does not
count: if I want to read a UTF-8 path from a database, do some arithmetic
work on it, and store it back, for some reason the author of the idea jumps
in the middle and decides that, because I may potentially pass this string
to Windows (and in this case I do not), he must convert my string to UTF-16
as soon as I give it to him... What nonsense. Why can't I choose the exact
type I want to use: path<char, allocator, etc...>...?
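
As a rough sketch of what I mean (basic_path, u8path and wpath below are
hypothetical names for illustration, not an existing Boost interface), a
path parameterized on its character type can do the syntactic work without
ever converting the encoding:

```cpp
#include <string>

// Hypothetical sketch, not an existing Boost interface: the caller picks
// the character type, and the class never converts between encodings.
template <class Char>
class basic_path {
public:
    typedef std::basic_string<Char> string_type;

    basic_path() {}
    basic_path(const string_type& s) : s_(s) {}
    basic_path(const Char* s) : s_(s) {}

    // Append a component, inserting a single '/' if one is not already there.
    basic_path& operator/=(const basic_path& rhs) {
        if (!s_.empty() && s_[s_.size() - 1] != Char('/'))
            s_ += Char('/');
        s_ += rhs.s_;
        return *this;
    }

    // Non-template friend, so string arguments convert implicitly.
    friend basic_path operator/(basic_path lhs, const basic_path& rhs) {
        lhs /= rhs;
        return lhs;
    }

    // The stored string, exactly as it was given and assembled.
    const string_type& string() const { return s_; }

private:
    string_type s_;
};

typedef basic_path<char>    u8path;  // assumed to hold UTF-8
typedef basic_path<wchar_t> wpath;   // for those who prefer wide strings
```

A UTF-8 string read from the database goes into a u8path, gets a component
appended, and comes back out through string() with its original bytes
untouched; nothing is converted unless some function actually has to call
the OS.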

The correct solution to make fopen("narrow string") work with Unicode is to
require Unicode support in the standard, and this is a much simpler
solution that has a larger impact than any path library can have. In part
because it would solve it for C guys too, and in part because it requires
only changes to the spec and implementations, no new bloated interfaces
whatsoever.

[1] http://utf8everywhere.org/

-- 
Yakov
