Subject: Re: [boost] [filesystem] Request for comments on proposed relative() function
From: Yakov Galka (ybungalobill_at_[hidden])
Date: 2014-05-16 03:21:10


On Fri, May 16, 2014 at 3:53 AM, Gavin Lambert <gavinl_at_[hidden]> wrote:

> On 16/05/2014 00:44, quoth Yakov Galka:
>
>> One problem with boost filesystem is the lack of theoretical foundation.
>>
>
> There kind of is one if you look at the Definitions section at
> http://www.boost.org/doc/libs/1_55_0/libs/filesystem/doc/reference.html.
> I'm not completely convinced that the implemented methods actually obey
> this though, nor that they are necessarily wrong in not doing so. :)

It is simply not enough. All of the path-related wording describes syntax
rather than defining semantics. It does not define what path concatenation
means, thus leaving cases like "c:" / "d:" unspecified. In the end, the
wording of operator / says simply that it "adds a separator when needed". I
believe that the word "separator" should not even appear in the documentation.

At best, you can use the Definitions section to work with generic paths
only. Nothing in it lets you infer what "[A.B]F.TXT"/"[D]" should do on
OpenVMS.

...
> relative("c:/a/b", "c:/") = "a/b"
>
Absolutely, this was a write-in-progress bug :)

> relative("c:a/b", "d:/") = "c:a/b"
>
Yes, this is implied by my definition: "d:/" / "c:a/b" = "c:a/b"

> relative("c:a/b", "c:/") = "a/b"
> ... Though I accept that this third case may be borderline; alternatively
> it should return the same as case 2.)
>

My point is that defining relative() in isolation is a road to nowhere.
Your example does not satisfy "c:/" / "a/b" = "c:a/b", so it is incorrect.
It should be "c:a/b" or NaP, depending on how you define concatenation.
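
The round-trip requirement used above (a result r for relative(p, base) is acceptable only if base / r = p) can be checked mechanically. The following is a minimal sketch on a toy path model, not Boost.Filesystem code: the names P, parse, join, and is_valid_relative are hypothetical, and the concatenation rule in join is just one possible definition, assumed here for illustration.

```cpp
#include <string>

// Hypothetical minimal model, not Boost.Filesystem: a path is an optional
// drive ("c:"), an optional root directory, and a relative remainder.
struct P {
    std::string drive;  // "" or e.g. "c:"
    bool rooted;        // true if a root directory follows the drive
    std::string rest;   // remaining elements, e.g. "a/b"
};

bool operator==(const P& x, const P& y) {
    return x.drive == y.drive && x.rooted == y.rooted && x.rest == y.rest;
}

// Parse strings of the form [drive:][/]rest (single-letter drives only).
P parse(const std::string& s) {
    P p{"", false, ""};
    std::size_t i = 0;
    if (s.size() >= 2 && s[1] == ':') { p.drive = s.substr(0, 2); i = 2; }
    if (i < s.size() && s[i] == '/') { p.rooted = true; ++i; }
    p.rest = s.substr(i);
    return p;
}

// One possible concatenation rule (an assumption): a right-hand side that
// carries a drive or a root directory replaces the base entirely.
P join(const P& base, const P& p) {
    if (!p.drive.empty() || p.rooted) return p;
    P r = base;
    if (!r.rest.empty() && !p.rest.empty()) r.rest += '/';
    r.rest += p.rest;
    return r;
}

// Round-trip law: r is an acceptable relative(p, base) only if base / r == p.
bool is_valid_relative(const std::string& p, const std::string& base,
                       const std::string& r) {
    return join(parse(base), parse(r)) == parse(p);
}
```

Under this rule, is_valid_relative("c:a/b", "c:/", "a/b") is false, while is_valid_relative("c:a/b", "c:/", "c:a/b") is true, which matches the three cases discussed above.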

> Also I'm dubious whether allowing the base-path of relative() to be itself
> relative is useful in any way. In particular the case
> relative(absolute-path, relative-path) seems nonsensical; at best it should
> probably return the unmodified absolute-path.
>

It is perfectly defined by my definition and returns the absolute-path.

> Though I suppose for consistency with absolute/canonical, relative() could
> use absolute(base) as its base internally, which would produce reasonable
> results (albeit somewhat dependent on filesystem state -- but that's not
> unexpected when dealing with relative paths).

It is unexpected to me -- when I'm dealing with paths I don't want to
access the filesystem at all, unless I explicitly ask for it.

> ...
> I'm not sure how the Linux filesystem in general behaves, but the shell
> typically appears to do the same thing -- cd into a symlinked folder
> followed by "cd .." gets you back to your original folder, not the parent
> of the symlink target.

What cd does depends on the shell. I don't know about Linux, but on FreeBSD
with sh, AFAIR, cd resolves the paths. Also, if I cd into a filesystem node
that then gets deleted, I cannot "cd .." because the node no longer exists.
This is really annoying!

>> absolute(p,base) illogical
>> --------------------------
>>
>> The case when p.has_root_name() && !p.has_root_directory() makes no
>> sense. I remember that it was once left unspecified. Don't know why it
>> got defined now.
>>
>
> absolute("c:foo", "c:/bar") == "c:/bar/foo"
>
> This requires that case.

And absolute("c:foo", "d:/bar") == "c:/bar/foo"... It does not make sense.

Back to the theory: you could define "c:/a" / "c:" = "c:/a" and "c:/a" /
"d:" = "d:" (I think this is what SetCurrentDirectory, but not cd, does).
Then our abstract concatenation would do what absolute() does, but
correctly.
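
As an illustration of the drive-aware definition just described, here is a small sketch on a toy model. It is not the Boost.Filesystem implementation; P, parse, str, join, and join_str are hypothetical names, and the handling of a right-hand side with a root directory is an assumption that the message does not cover.

```cpp
#include <string>

// Toy model (hypothetical): optional drive, optional root directory, remainder.
struct P { std::string drive; bool rooted; std::string rest; };

// Parse strings of the form [drive:][/]rest (single-letter drives only).
P parse(const std::string& s) {
    P p{"", false, ""};
    std::size_t i = 0;
    if (s.size() >= 2 && s[1] == ':') { p.drive = s.substr(0, 2); i = 2; }
    if (i < s.size() && s[i] == '/') { p.rooted = true; ++i; }
    p.rest = s.substr(i);
    return p;
}

std::string str(const P& p) {
    return p.drive + (p.rooted ? "/" : "") + p.rest;
}

// Drive-aware concatenation:
//  - a right-hand side on a different drive replaces the base;
//  - a right-hand side with a root directory also replaces it (assumption);
//  - otherwise (same drive or no drive, no root directory) it is appended.
P join(const P& base, const P& p) {
    if (p.rooted || (!p.drive.empty() && p.drive != base.drive)) return p;
    P r = base;
    if (!r.rest.empty() && !p.rest.empty()) r.rest += '/';
    r.rest += p.rest;
    return r;
}

// Convenience wrapper working on strings.
std::string join_str(const std::string& base, const std::string& p) {
    return str(join(parse(base), parse(p)));
}
```

With this rule, join_str("c:/bar", "c:foo") gives "c:/bar/foo", the same as absolute("c:foo", "c:/bar"), whereas join_str("d:/bar", "c:foo") gives "c:foo" rather than "c:/bar/foo".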

> * canonical() -- a better name would be resolve(), cause what it
>> ultimately does is resolving symlinks? Naming it realpath() is also an
>> option.
>>
> ...
> (Perhaps system_complete()? Hard to tell from the docs, as it defines its
> behaviour in terms of a function that does not exist.)

Yes, system_complete, on Windows, resolves ".." and "c:a" correctly. And it
does not resolve symlinks, which is the correct thing on Windows.

> Rationale: by the definitions section, paths are abstract entities
>> detached from the filesystem. Asking for relative(x,y) or x/y are
>> legitimate questions on their own, even if the paths do not exist at the
>> specific point in time and space. Even when x and y are relative to a
>> yet-unknown base. Making them depend on the current filesystem state is
>> error prone.
>>
>
> I don't think anyone ever suggested relative() should require that the
> path exist. But I don't think that it's unreasonable for it to react to
> the current directory of each drive -- that's the purpose of relative
> paths. (And if you want it to be relative to some other folder, you should
> specify which one explicitly as an absolute path, and then there is no
> dependency on the filesystem.)

This all sounds good, but working with paths relative to an *unknown* base
is useful. Think of some project directory that tries to be relocatable.
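
To make the relocatable-project case concrete, here is a sketch that computes one path relative to another without touching the filesystem, where both inputs are themselves relative to the same unknown project root. It assumes a C++17 toolchain and uses std::filesystem::path::lexically_relative, which postdates this thread; relative_within_project is a hypothetical helper name.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: express `target` relative to `base` purely lexically.
// Neither path needs to exist, and both may be relative to the same
// (unknown) project root. No filesystem access takes place.
std::string relative_within_project(const fs::path& target,
                                    const fs::path& base) {
    return target.lexically_relative(base).generic_string();
}
```

For example, relative_within_project("proj/src/a.cpp", "proj/build") yields "../src/a.cpp" wherever the project directory ends up.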

> I think most of the points you brought up here aren't really on-topic in
> this particular thread, and would have been better made in a separate
> thread (or by writing your own alternative implementation). I doubt it's
> likely that grand sweeping changes to an existing accepted library would
> get anywhere. But that doesn't mean you couldn't submit an alternative
> intended to supersede it; that's happened in the past.
>

True, some of them are off-topic. And I do have an alternative path
implementation that I'm using myself, which I might release some day.
However, boost.filesystem has already gone through three major versions, and
it is actively being pushed toward standardization. So fixing it might be
more logical than introducing another library that fixes those concrete
problems but presents an entirely different, likely controversial, approach.

-- 
Yakov
