From: Beman Dawes (bdawes_at_[hidden])
Date: 2002-09-19 19:26:28


At 04:23 PM 9/19/2002, Thomas Witt wrote:

>I propose the following changes to the filesystem path interface in
>order to solve the problem of rooted/absolute paths.

Before getting into details, overall I think I like your idea of treating
the concept of a "root" separately, with some operations that operate on
"roots". This separation from the "root directory" concept is helpful.

Be warned, however, that a number of past ideas had to be abandoned after a
lot of work went into them. This stuff is messy because of the portability
issues.

I'm going to reorder some of your message for ease of discussion.

>I'd like to introduce the notion of a path 'root' and of an 'absolute
>path'.
>
>Root:
>A filesystem can have multiple roots (e.g. C:;D:); each filesystem has
>at least one root. A path can be rooted. That means it belongs to the
>hierarchy of its root.

Also, a rooted path can be either absolute or relative, as can be a
non-rooted path.

> class path
> {
> ...
> /*
> * Returns an empty root for
> * non-rooted paths.
> * Otherwise a non-empty root
> */
> Root const root() const;
> ...
> };
>
>

Good, but see the comment below about the return value.

We also need to be very sure of each element in the path model for the
following cases (assume Windows). Assume leading "/" gets added to the
grammar as "root directory".

    p.generic_path()  has_root()  has_root_dir()  Elements
    ----------------  ----------  --------------  --------
    "c:"              true        false           c:
    "/"               false       true            /
*   "c:/"             true        true            c:,/
    "foo"             false       false           foo
    "/foo"            false       true            /,foo
    "c:foo"           true        false           c:,foo
*   "c:/foo"          true        true            c:,/,foo
**  "//share"         true        true            //share
*   "//share/"        true        true            //share,/
*   "//share/foo"     true        true            //share,/,foo

Cases marked (*) represent a change from the current specs in that they
separate the concept of "root" from "root directory".

Note that is_absolute() would then be defined as has_root() &&
has_root_dir(), which may be too complicated. It may be better to replace
the has_root_dir() column above with is_absolute().

I'm not 100% sure (**) is a valid case, but that isn't important.
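
To make the table concrete, here is a minimal sketch of a purely lexical decomposition that reproduces its rows under Windows rules. The names Decomposed, decompose, and check_rows are hypothetical and not part of the library. The handling of a bare "//share" simply follows the (**) row and is just as uncertain as that row.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type; the members mirror the table columns.
struct Decomposed {
    std::string root;                   // "c:" or "//share"; empty if none
    bool has_root_dir = false;
    std::vector<std::string> elements;

    bool has_root() const { return !root.empty(); }
    // Definition discussed in the text: a root plus a root directory.
    bool is_absolute() const { return has_root() && has_root_dir; }
};

// Split a generic-format path string (Windows rules assumed).
Decomposed decompose(const std::string& p) {
    Decomposed d;
    std::size_t pos = 0;
    bool share = false;

    if (p.size() >= 3 && p[0] == '/' && p[1] == '/' && p[2] != '/') {
        // "//share" style root: runs up to the next '/' or the end.
        std::size_t end = p.find('/', 2);
        if (end == std::string::npos) end = p.size();
        d.root = p.substr(0, end);
        pos = end;
        share = true;
    } else if (p.size() >= 2 &&
               std::isalpha(static_cast<unsigned char>(p[0])) && p[1] == ':') {
        // Drive style root such as "c:".
        d.root = p.substr(0, 2);
        pos = 2;
    }
    if (d.has_root()) d.elements.push_back(d.root);

    if (pos < p.size() && p[pos] == '/') {
        d.has_root_dir = true;
        d.elements.push_back("/");
        ++pos;
    } else if (share) {
        // Row (**): the table lists has_root_dir() as true for bare "//share".
        d.has_root_dir = true;
    }

    // Remaining names, separated by '/'.
    while (pos < p.size()) {
        std::size_t end = p.find('/', pos);
        if (end == std::string::npos) end = p.size();
        if (end > pos) d.elements.push_back(p.substr(pos, end - pos));
        pos = end + 1;
    }
    return d;
}

// Spot-check a few rows of the table.
void check_rows() {
    assert(decompose("c:foo").has_root());
    assert(!decompose("c:foo").has_root_dir);
    assert(decompose("/foo").elements.size() == 2);
    assert(decompose("c:/foo").is_absolute());
}
```

With this definition "c:foo" and "/foo" both come out as not absolute, which is the corner-case behavior the table is meant to pin down.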

> * Roots that refer to the same filesystem
> * hierarchy are guaranteed to compare equal.

>The path iterator no longer references the root element of the
>path. I.e. iterating over a path means iterating over path elements
>as defined in the grammar. From my experience that would greatly
>simplify path iteration.

Sometimes users will want iteration to include the root, sometimes they
won't. Ditto root_directory(). Rather than exclude the root and/or
root_directory from iteration, I'd say provide bool functions to say if
they are present (has_root(), has_root_directory()), and extraction
functions.
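
As a rough illustration of that approach, the sketch below keeps the root and root directory in the element sequence and lets the caller skip them by consulting the query functions. PathSketch, first_name_index, and names_only are hypothetical names, not library API, and the sketch assumes a "/" element is present whenever has_root_directory() is true.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed path: elements in order, plus flags.
struct PathSketch {
    std::vector<std::string> elements;  // e.g. {"c:", "/", "foo"}
    bool root_present;
    bool root_dir_present;

    bool has_root() const { return root_present; }
    bool has_root_directory() const { return root_dir_present; }

    // Extraction functions: return an empty string when the part is absent.
    std::string root() const {
        return has_root() ? elements.front() : std::string();
    }
    std::string root_directory() const {
        return has_root_directory() ? std::string("/") : std::string();
    }
};

// Index of the first ordinary name, past any root and root directory.
std::size_t first_name_index(const PathSketch& p) {
    return (p.has_root() ? 1u : 0u) + (p.has_root_directory() ? 1u : 0u);
}

// The elements a caller sees when choosing to iterate without the root parts.
std::vector<std::string> names_only(const PathSketch& p) {
    return std::vector<std::string>(
        p.elements.begin() + static_cast<std::ptrdiff_t>(first_name_index(p)),
        p.elements.end());
}

void demo_iteration() {
    PathSketch p{{"c:", "/", "foo", "bar"}, true, true};
    assert(p.root() == "c:");
    assert(p.root_directory() == "/");
    assert(names_only(p).size() == 2);   // "foo", "bar"
    assert(p.elements.size() == 4);      // full iteration still sees the root
}
```

A caller who wants the root included iterates over all elements; one who does not starts at first_name_index().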

The case of "/" on POSIX, which is both root and root directory, would need
to be handled carefully. Because of that, I'd like to work the above table
out again, assuming "/" isn't added to the portable grammar, and see if that
eliminates the need for the "root directory" concept and functions.

>Roots are modelled by a newly introduced root type that
>provides the following public interface. An access method
>is added to path to retrieve the root. Whether Root has to be default
>constructible needs further investigation.
>
>namespace boost {
> namespace filesystem {
>
> class Root
> {
> public:
> bool empty() const;
> };
>
> bool operator==(Root const&, Root const&);
> bool operator!=(Root const&, Root const&);
> bool operator<(Root const&, Root const&);

Doesn't work. For many file systems there is no way to tell if two roots
are equal, because of aliases.

It really seems to me that Root reduces down to a std::string, just like
any other element in a path.

>Absolute Path:
>An absolute path describes a fixed named location in a filesystem
>hierarchy. Note that the location is fixed with respect to the name,
>not the referenced entity. Its counterpart is the relative path. An
>absolute path must have a non-empty root. A relative path does not have
>to be rooted, but it can be rooted. For instance, on Windows C:xy.txt is
>a relative rooted path.

>To support the notion of relative/absolute paths the following methods
>are added/modified. Whether these should be members/nonmembers needs
>further investigation.

Here is the rule I have been using to determine the member/nonmember
questions: If the function operates on a path purely at the lexical level,
without calls to the operating system, then it is a path member
function. If the function operates on the underlying filesystem (by
calling the operating system), then it is a non-member function.
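
A minimal sketch of how that rule divides an interface. The class path_sketch and the functions starts_with_slash and can_open are hypothetical and chosen only to illustrate the split. In particular, "can be opened for reading" is an assumed stand-in for a real filesystem query.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical minimal path class.
class path_sketch {
public:
    explicit path_sketch(std::string s) : text_(std::move(s)) {}

    // Purely lexical: inspects only the stored string, so it is a member.
    bool starts_with_slash() const {
        return !text_.empty() && text_[0] == '/';
    }

    const std::string& str() const { return text_; }

private:
    std::string text_;
};

// Asks the operating system about the underlying filesystem,
// so by the rule above it is a non-member function.
bool can_open(const path_sketch& p) {
    std::ifstream in(p.str().c_str());
    return in.is_open();
}

void demo_rule() {
    path_sketch p("/foo");
    assert(p.starts_with_slash());  // answered without touching the disk
}
```

The member function gives the same answer on every machine, while the result of can_open depends on what exists on the machine where it runs.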

>
>namespace boost {
> namespace filesystem {
>
> class path
> {
> ...
> /*
> * Returns true if path is absolute
> */
> bool is_absolute() const;

Why is_absolute() rather than is_relative()? Which would be clearer in the
corner cases like Windows "c:foo", "/foo"?

> /*
> * Precondition p.root().empty()
> */
> path& operator<<=(path const& p);
>
> /*
> * Precondition p.root().empty()
> */
> path const operator<<(path const& p);

With composition functions like those below, the <<= and << preconditions
can probably be tightened. You were smart to realize that:-)

> ...
> };
>
> /*
> * Returns an absolute path by either
> * prepending the current working directory
> * of the application, or by inserting the root
> * specific working directory for rooted paths.
> * Does nothing if p is absolute.
> */
> path const absolute(path const& p);
>
> // We may want to provide the following
> // convenience method
> /*
> * Merges p1 and p2.
> *
> * Precondition: p1.root() == p2.root()
> *   || p2.root().empty()
> *   || p1.root().empty().
> * If both are absolute, one must be a sub-path of the other.
> */
> path const merge(path const& p1, path const& p2);

I'd rather settle the other questions first, then look at the additional
functions.

>Comments ?

Notice that none of the above makes any substantial change to the current
library. It is all additional functionality to make dealing with roots and
relative/absolute issues much easier.

Basically, we would like to reduce the cases where the user has to paw
around with the path iterators to really unusual situations, and to
increase the number of operations that the user can do portably.

That's the key rationale for adding path::root() and the query
functions. There is no portable way for the user to write the equivalent
functions.

Thanks for all the ideas,

--Beman

