From: Carlo Wood (carlo_at_[hidden])
Date: 2004-08-21 15:52:30


On Sat, Aug 21, 2004 at 11:36:31AM +0100, John Maddock wrote:
> > I propose the following design. The aim of boost::filesystem should be to
> > support the following coding idiom:
> >
> > * The programmer should take care to only handle two types of paths
> > in his application:
> >
> > 1) Complete paths
> > 2) Relative paths
>
> That is true now.

No, the current implementation/design doesn't restrict anything at all
in this regard. You can just use fs::path to store paths - and that is
it. A path is not aware of a distinct difference between complete
paths and relative paths (except that they are complete or relative
of course) - definitely not in the way that a design using two
different classes for the two would be.

The most problematic difference is that it is possible to even store
a third possibility (one that is neither complete nor relative).
This is what caused me to see that there is a design error in the first
place and caused me to think about improvements.
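
To make the "third possibility" concrete: under Windows-style rules a path
such as "/foo" (root directory but no drive) or "C:foo" (drive but no root
directory) is neither complete nor relative in the sense used here. The
following is a minimal sketch, not part of boost::filesystem; `path_kind` and
`classify` are hypothetical names, and the drive-letter rule is a simplifying
assumption made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical classification, not part of boost::filesystem.
enum class path_kind { complete, relative, neither };

// Classify a path string using simplified Windows-style rules (an assumption
// for this sketch): a root name is a drive letter followed by ':', and a root
// directory is a '/' or '\\' directly after the optional root name.
inline path_kind classify(const std::string& s)
{
    std::size_t pos = 0;
    const bool has_root_name = s.size() >= 2
        && std::isalpha(static_cast<unsigned char>(s[0])) && s[1] == ':';
    if (has_root_name) pos = 2;

    const bool has_root_directory =
        s.size() > pos && (s[pos] == '/' || s[pos] == '\\');

    if (has_root_name && has_root_directory) return path_kind::complete;
    if (!has_root_name && !has_root_directory) return path_kind::relative;
    return path_kind::neither;  // e.g. "/foo" or "C:foo"
}
```

With two separate classes for complete and relative paths, the `neither`
case would have nowhere to be stored.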

> > * The programmer will have to specifically tell the library when a constructed
> > path is 'native' and when not. A native path is accepted according to native
> > rules and never gives any problem (exceptions) further on.
> > A non-native path is checked according to the existing rules, which basically
> > means that the programmer can set a default check routine that will in effect
> > determine how portable the application will be.
>
> That is also true now, isn't it?

No, because the 'native' that you can specify with the current design
is only a check on the characters used in the directory components, and
only related to the representation of a path - not *marking* the path
as different. The 'native' I am talking about is enforced for complete
paths and also reflects the root part of a directory ("C:\", "c:/cygwin/",
"/" etc). The essential difference is that a path that is once marked
as 'native' must stay native. It can never be converted to a portable
path anymore. Only relative paths that are portable from the start
can stay portable after certain operations (like appending a directory, or
cutting off a directory at the end).
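
As a rough illustration of such a sticky 'native' mark, here is a minimal
sketch. The class `tagged_path` and all of its members are hypothetical;
boost::filesystem::path has no such state, and the string handling is
deliberately naive.

```cpp
#include <string>
#include <utility>

// Hypothetical wrapper; boost::filesystem::path has no such internal state.
class tagged_path
{
public:
    enum kind { portable, native };

    explicit tagged_path(std::string s, kind k = portable)
        : str_(std::move(s)), kind_(k) {}

    // Appending never clears the 'native' mark: the result is native if
    // either operand is native, and portable only if both are portable.
    tagged_path operator/(const tagged_path& rhs) const
    {
        const kind k =
            (kind_ == native || rhs.kind_ == native) ? native : portable;
        return tagged_path(str_ + "/" + rhs.str_, k);
    }

    // Cutting off the last component preserves the mark as well.
    tagged_path branch_path() const
    {
        const std::string::size_type pos = str_.rfind('/');
        return tagged_path(
            pos == std::string::npos ? std::string() : str_.substr(0, pos),
            kind_);
    }

    bool is_native() const { return kind_ == native; }
    const std::string& raw() const { return str_; }

private:
    std::string str_;
    kind kind_;
};
```

There is intentionally no operation that turns a native `tagged_path` back
into a portable one.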

> > I propose two design changes:
> >
> > 1) 'native' is now not only a representation, but an *internal state* of fs::path.
> > (this has no effect on the representation as returned by fs::path::string()).
> > 2) All 'complete' paths are automatically marked as 'native'.
>
> If I understand you correctly, you are suggesting that error checking is
> turned off for native paths, I would support that, but other than that I
> don't see how it differs.

Heh - the current design does NOT have an internal state that marks
a path as native or not. Hence, it cannot do sensible error checks
that need that knowledge. Isn't that difference enough?

> Actually, there is another case where error checking needs to be turned off:
> when the path is obtained from a directory_iterator, but is none the less
> relative (and *please* don't tell me that all such paths should be complete,
> that would break a lot of code; actually it would make a lot of coding
> idioms impossible).

A directory_iterator returns a single directory component, no? Apart from
the fact that this is almost a plain string, it is at most a relative path: the
root part is not there. I have absolutely no problem when the root part
is simply forgotten; it will be easy enough to add it back (and to detect
the error when one forgets to).

I can see a problem with iterating over a native_path though. I'd argue
for only allowing iteration over relative_path components.

For example:

[...] sorry, but at this point my eye fell on:

> > Examples, the following code is legal:
> >
> > fs::path p1("C:\\foo\\a.exe", native); // As one might do on windows.
> > fs::path p2("/usr/src/a.out", native); // As one might do on linux.
> > std::cout << p1.native_file_string(); // ok, p1 is native.
> >
> > fs::path p3("foo/bar"); // Relative path, always succeeds.
> > std::cout << p3.string(); // ok
> > std::cout << p3.native_directory_string(); // Useless, but ok.
>
> Not useless at all. And works now.
  ^^^^^^^^^^^^^^^^
    \__ this.

That is not backed up by anything.
We can't have a discussion along the lines of "yes" "no" "yes" "no".
Please try to be a bit more constructive. *sets the example verbosity
a few notches back*.

[...] anyway... it is thus a problem when you allow iterating over
a native_path and have that produce relative_path's. I wouldn't allow that.

> > fs::path p4(complete(p3)); // p4 is now "native", because now it is complete.
> > std::cout << p4.native_directory_string(); // ok
> >
> > And the following will fail (assertion?):
> >
> > std::cout << p4.string(); // Not allowed because p4 is native (complete).
>
> One could add an assertion that the path is not complete, if you want that
> behaviour.

Yes. Throwing an exception does not seem correct. An assertion makes the most sense.

> Actually this change would break the bcp utility - there is a (slightly
> hairy) use for this.

Would you mind defending that use? I cannot think of a use that is in fact
not portable (in a dangerous way) and cannot be solved better.

> > Since there would be a default way defined of how a relative path is
> > completed, all operation functions will accept both, relative
> > and complete paths. For example:
> >
> > fs::path p1("C:\\cygwin\\usr/bin/ls", native); // Legal path on Cygwin.
> > if (fs::exists(p1)) // Ok, access complete path.
> >
> > // For clarification
> > fs::path p2("C:\\cygwin\\usr"); // Just an example
> > fs::default_working_directory(p2); // fictitious function.
> >
> > fs::path p3("bin/ls"); // Portable representation (refers in fact to "C:\cygwin\usr\bin\ls.exe").
> > if (fs::exists(p3)) // Ok : exists() will make the path complete before testing.
> >
> > And this throws:
> >
> > fs::path p4("/bin/ls"); // Not allowed: this path has a root but is not marked 'native'.
> >
> > Setting the default_working_directory shall always
> > need to be done for each supported OS separately.
> > Of course you can set it to "/" on single root machines, and
> > set it to "E:/" after extracting the 'E:' from the current
> > path at application start up, in effect simulating a 'single root':
>
> You should never need to set that explicitly unless you want to: each
> application inherits a default working directory anyway from the host
> environment.

Yes - it should be set by default to the working directory at application
start (first use of fs::path, or however it works now). But there is no harm
in allowing it to be changed explicitly. I am just saying that this
default working directory should NOT change automagically as a function
of the _real_ working directory. I think this is the case now too, no?
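
A minimal sketch of that behaviour, assuming a hypothetical helper class
named after the fictitious function quoted above: the base directory is
stored once and only changes through an explicit call. Nothing here queries
the operating system; a real program would presumably pass in the directory
it was started from.

```cpp
#include <string>
#include <utility>

// Hypothetical helper, not part of boost::filesystem: remembers a base
// directory chosen once (for example at application start) and completes
// relative paths against it.
class default_working_directory
{
public:
    explicit default_working_directory(std::string base)
        : base_(std::move(base)) {}

    // Explicit change only; nothing in this class tracks the process's
    // real working directory.
    void set(std::string base) { base_ = std::move(base); }

    const std::string& get() const { return base_; }

    // Complete a relative path against the stored base, avoiding a doubled
    // separator when the base already ends in '/'.
    std::string complete(const std::string& relative) const
    {
        if (base_.empty() || base_.back() == '/') return base_ + relative;
        return base_ + "/" + relative;
    }

private:
    std::string base_;
};
```

Setting the base to "/" gives the 'single root' simulation mentioned in the
quoted text: "bin/ls" then completes to "/bin/ls".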

> BTW the behaviour you're asking for was required by bcp - all the paths are
> relative to some root (the boost installation path) - that path may be
> relative or absolute; and whenever you need a path relative to some root,
> one can just use:
>
> my_root / my_relative_path
>
> so again, you can do what you want right now.

I think I can implement my proposal on top of the current boost::filesystem, yes.
But that doesn't mean that boost::filesystem is robust, or supports coding
in a portable way.

> > The average application will work with relative
> > paths, relative to some (native) base directory
> > and next to that have some arbitrary, complete and thus native
> > directories (ie, read from environment variables).
> >
> > But in case more than one 'working' directory seems needed
> > then we can add support for that too by allowing to
> > construct paths with a reference to the (complete/native)
> > working directory. Ie,
> >
> > fs::path homedir(g_getenv("HOME"), native);
> > fs::path rcdir(homedir / "edragon/rc", native);
> > fs::path tmpdir(current_path().root_path() / "tmp");
> > // ...
> >
> > fs::path runtime_rcfile(rcdir); // Set 'runtime_rcfile' to be relative to 'rcdir'.
> > // ...
> >
> > runtime_rcfile = "config/runtimerc";
> >
> > which is then relative to `rcdir' instead of
> > a single, global 'working directory' (as now returned
> > by fs::current_path()).
>
> I'm sorry, but that looks way more complicated to me than the current
> design: if you want a path to be relative to a specific base, then use
> "my_base/my_path", it's easy to use, works, and it's clear what you mean as
> well.

Right now you can call all the operations that actually pass a path to some
system call - and UNLESS that path is complete - it will not be portable.
THAT means that boost::filesystem has a structural design error. You are
now saying that I can complete all paths just prior to calling any operation
that passes paths to system calls. Would you mind even listing those for me?
Why not change boost::filesystem so that it takes care of completion
automatically when needed? That way you cannot make errors.
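
A minimal sketch of what "takes care of completion automatically" could look
like, using two distinct path types and an in-memory stand-in for the file
system so that it is self-contained. All names (`native_path`,
`relative_path`, `fake_filesystem`) are hypothetical and not the
boost::filesystem API.

```cpp
#include <set>
#include <string>

// Hypothetical types; this is not the boost::filesystem API.
struct native_path   { std::string str; };  // complete, OS-specific
struct relative_path { std::string str; };  // portable, has no root

// Stand-in for the real file system so the sketch is self-contained.
struct fake_filesystem
{
    std::set<std::string> files;      // complete paths that "exist"
    native_path working_directory;    // base used to complete relative paths

    // Only complete paths ever reach the (simulated) system call.
    bool exists(const native_path& p) const
    {
        return files.count(p.str) != 0;
    }

    // Relative paths are completed first, so callers cannot forget to.
    bool exists(const relative_path& p) const
    {
        return exists(native_path{working_directory.str + "/" + p.str});
    }
};
```

Because the overload set accepts only these two types, a path with a root
that was never marked native cannot be passed to `exists` at all.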

As an example - I developed my application first on GNU/Linux, with the only
requirement being that it had to be portable to cygwin. Therefore, I used UNIX paths.
Testing if /usr/src/edragon/edragon/build/src/gui/gui.la existed on linux
worked fine. It did NOT work on cygwin. Because I was stupid? No, I think
that was because boost::filesystem is not actively supporting portable programming.

-- 
Carlo Wood <carlo_at_[hidden]>
