Subject: Re: [boost] [filesystem] proposal: treat reparse files as regular files
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-08-03 12:31:27


On 3 Aug 2015 at 16:38, Gavin Lambert wrote:

> > There is a single page "cheat sheet" at
> > https://boostgsoc13.github.io/boost.afio/doc/html/afio/overview.html.
>
> It would be nice if this included hyperlinks for the local types. I
> have no idea what a directory_entry looks like.

Fixed. Each operation on the cheat sheet now lists what types are
related to it.

> (And even after manually navigating around the docs until I found
> https://boostgsoc13.github.io/boost.afio/doc/html/afio/reference/classes/directory_entry.html,
> I still have no idea what those fields actually *mean*. Only because
> you mentioned it below did I also find
> https://boostgsoc13.github.io/boost.afio/doc/html/afio/reference/structs/stat_t.html,
> which is more descriptive. Although I later went back and noticed I
> overlooked fetch_lstat on directory_entry. Another case where
> hyperlinks would have been nice.)

Fixed. Each reference page now links to related types too.

> >> but wouldn't it make the most
> >> sense to report other name surrogates as symlinks as well (via an "is
> >> this a symlink" or "get file type" method), but then if queried for the
> >> target of an unknown symlink type it will return/throw a "not supported"
> >> error?
>
> Using the above vocabulary, it seems to me that:
>
> - enumerate() / lstat() should be able to report all name surrogates
> as symlinks, however that is currently done (presumably via st_type ==
> symlink_file). Other reparse types should be reported as regular
> files/directories.

I would prefer not to report something as a symlink when target()
won't work with it. So you now have an additional stat_t flag called
st_reparse_point, which always reflects the
FILE_ATTRIBUTE_REPARSE_POINT attribute.

> - symlink() should be able to open unknown symlinks (since that's
> just a flag to CreateFile).

This works.

> - rmsymlink() should be able to delete unknown symlinks.

This works.

> - target() should work for the known symlink types and fail "not
> supported" (or similar) for the other name surrogate types, and fail
> "invalid operation" (or similar) for any non-reparse file or
> non-name-surrogate type.

This works. Unknown symlink types return an EINVAL error as per
POSIX.
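
For readers unfamiliar with the POSIX convention referred to here:
readlink(2) fails with EINVAL when the path is not a symbolic link.
The following is only a sketch of that convention on a POSIX system,
not AFIO's implementation; read_target is a hypothetical helper name
chosen for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <string>
#include <system_error>
#include <unistd.h>

// Hypothetical helper: returns the link target, or an empty string with
// ec set to the errno value reported by readlink(2).
std::string read_target(const char* path, std::error_code& ec) {
    assert(path != nullptr);
    char buf[PATH_MAX];
    const ssize_t n = ::readlink(path, buf, sizeof buf);
    if (n < 0) {
        // EINVAL here means "the named file is not a symbolic link".
        ec.assign(errno, std::generic_category());
        return std::string();
    }
    ec.clear();
    // readlink does not NUL-terminate, so build the string from the length.
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Calling read_target("/", ec) on a POSIX system leaves ec comparing
equal to std::errc::invalid_argument, since "/" is a directory and not
a symlink.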

> Does that sound reasonable?

Yes :)

> I suppose another variant on this would be to report known-type symlinks
> as st_type == symlink_file, unknown-type name surrogates as st_type ==
> type_unknown, and any other reparse point as st_type ==
> regular_file/directory_file. This would have the advantage of hinting
> whether target() is likely to work, but the disadvantage of being a bit
> misleading.
>
> (On a peripherally related note, it seems odd that Boost.Filesystem's
> file_type appears to lack a way to express "a symlink to a directory",
> which should be opened as a directory instead of as a file. Is this a
> POSIX limitation, that you're required to inspect the target to
> determine whether it's a file or directory? I know that Windows
> provides this up-front, both for junctions and for actual symlinks,
> which in turn means that if you do want to follow directory symlinks
> then you can just open them as regular directories without fanfare. Of
> course, that's also partly why symlinks are discouraged on Windows,
> because naive enumeration code will follow them by default and hilarity
> can ensue.)

Filesystem is trapped by POSIX here, however, and POSIX treats
symlinks as a special thing unto themselves.

AFIO is a bit caught here too actually. If you're enumerating a
directory you have no easy way of disambiguating between a symlink to
a directory and a symlink to a file. You basically have to try
opening it as a directory, and if it errors out you then open it as a
file.
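
The try-as-directory-then-fall-back approach can be sketched on POSIX
roughly as below. This is an illustration under assumptions, not
AFIO's code: opened and open_dir_then_file are hypothetical names, and
it relies on open(2) with O_DIRECTORY failing with ENOTDIR when the
(symlink-resolved) path is not a directory.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical result type: the descriptor (or -1) and how it was opened.
struct opened {
    int fd;
    bool is_directory;
};

// First try to open the path as a directory; if the kernel reports that
// it is not one, open it again as a plain file.
opened open_dir_then_file(const char* path) {
    assert(path != nullptr);
    int fd = ::open(path, O_RDONLY | O_DIRECTORY);
    if (fd >= 0) {
        return opened{fd, true};
    }
    if (errno != ENOTDIR) {
        return opened{-1, false};  // some other failure, e.g. ENOENT
    }
    fd = ::open(path, O_RDONLY);
    return opened{fd, false};
}
```

The caller owns the returned descriptor and should close() it. Note
that the fallback path costs a second open() for every entry that
turns out to be a file.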

Windows reports what kind of symlink it is without additional
syscalls, but POSIX does not. You'd have to do two syscalls per entry
to disambiguate, which is very costly for something so niche.
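
One way to picture the two-syscall cost on POSIX is an lstat() to see
the link itself followed by a stat() to see what it resolves to. The
sketch below assumes that pairing; entry_kind and classify are
hypothetical names, and this is not how AFIO enumerates directories.

```cpp
#include <cassert>
#include <sys/stat.h>

enum class entry_kind {
    not_found,
    regular_file,
    directory,
    other,
    symlink_to_directory,
    symlink_to_non_directory,
    dangling_symlink
};

entry_kind classify(const char* path) {
    assert(path != nullptr);
    struct stat link_st {};
    // Syscall 1: describes the entry itself, without following symlinks.
    if (::lstat(path, &link_st) != 0) {
        return entry_kind::not_found;
    }
    if (!S_ISLNK(link_st.st_mode)) {
        if (S_ISDIR(link_st.st_mode)) return entry_kind::directory;
        if (S_ISREG(link_st.st_mode)) return entry_kind::regular_file;
        return entry_kind::other;
    }
    struct stat target_st {};
    // Syscall 2: follows the symlink to learn what it points at.
    if (::stat(path, &target_st) != 0) {
        return entry_kind::dangling_symlink;
    }
    return S_ISDIR(target_st.st_mode) ? entry_kind::symlink_to_directory
                                      : entry_kind::symlink_to_non_directory;
}
```

Only entries that really are symlinks pay for the second call here,
but a directory full of symlinks still doubles the metadata traffic.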

> > I am not averse to adding a "st_reparse_point" field to stat_t. This
> > would let client code do its own detection on Windows. Does this work
> > for you?
>
> I don't personally have a use case, so I can't really answer the last
> question. As I said I'm coming at this thread from a design standpoint
> rather than a practical one. (And the original focus of the thread was
> on Filesystem rather than AFIO.)
>
> Having said that, more information never hurts; but I think this should
> be in addition to the behaviour described above, not instead of it.

Well you've got a st_reparse_point field now, so you can detect
reparse points which aren't those understood by AFIO and special case
them if you so desire.
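
As a rough illustration of that special casing, client code could
combine the two fields along these lines. The stat_like struct is a
hypothetical stand-in that models only the st_type and
st_reparse_point fields mentioned in this thread; it is not AFIO's
stat_t, and entry_class, classify and describe are invented names.

```cpp
#include <cassert>

// Stand-in for the file type values discussed in this thread.
enum class file_type { regular_file, directory_file, symlink_file, type_unknown };

// Hypothetical stand-in for the two relevant stat_t fields.
struct stat_like {
    file_type st_type;
    bool st_reparse_point;  // assumed set for every reparse point on Windows
};

enum class entry_class { ordinary, known_symlink, unrecognised_reparse_point };

// A reparse point that was not reported as a symlink is one the library
// did not understand, so the caller may want to treat it specially.
entry_class classify(const stat_like& s) {
    if (s.st_type == file_type::symlink_file) return entry_class::known_symlink;
    if (s.st_reparse_point) return entry_class::unrecognised_reparse_point;
    return entry_class::ordinary;
}

const char* describe(entry_class c) {
    switch (c) {
        case entry_class::ordinary: return "ordinary entry";
        case entry_class::known_symlink: return "symlink with a readable target";
        case entry_class::unrecognised_reparse_point: return "reparse point of unknown type";
    }
    assert(false && "unhandled entry_class");
    return "";
}
```

On POSIX the flag would presumably never be set, so the same code
simply never takes the unrecognised branch.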

The key aim for AFIO is to provide POSIX filesystem semantics as
consistently as is portably possible. As mentioned in earlier threads,
any real-world use of async file i/o is going to need #ifdef for
platforms anyway, as filing systems are so different, but where I can
eliminate that I will.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/


