Subject: Re: [boost] boost::filesystem::path frustration
From: Neil Groves (neil_at_[hidden])
Date: 2013-01-25 12:20:37


On Fri, Jan 25, 2013 at 4:30 PM, Beman Dawes <bdawes_at_[hidden]> wrote:

> On Thu, Jan 24, 2013 at 8:56 PM, Dave Abrahams <dave_at_[hidden]> wrote:
> >
> > I'm finding that boost::filesystem::path seems to be a strange mix of
> > different beasts, unlike any entity we have in the STL. For example,
> > when you construct it from a pair of iterators, they're expected to be
> > iterators over characters, but when you iterate over the path itself,
> > you are iterating over strings of some kind (**). Even though, once
> > constructed, this thing acts sort of like a container, it supports none
> > of the usual container mutators (e.g. push_back, pop_back, erase) or
> > even queries (e.g. size()), making it incompatible with generic
> > algorithms and adaptors.
>
> It isn't really a container, but it is convenient to supply iterators
> over the elements of the contained path. Should more container-like
> mutators be supplied? I'm neutral - they would occasionally be useful,
> but add more signatures to an already fat interface.
>
>
Perhaps a path could have an interface analogous to
std::vector<std::string>, even if the implementation is optimised somewhat
to keep the commonly accessed string representation as the underlying
storage. Perhaps random access would be a bit daft, but it does seem
reasonable to converge the interface. Additionally, it might help to define a
new Concept that is a subset of Container to assist with Dave's goal of
maximising reuse within other algorithms.
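
To make the idea concrete, here is a rough sketch of what such a view could
look like. It uses std::filesystem, the standard-library descendant of
Boost.Filesystem, only so that it compiles on its own; components() and
from_components() are hypothetical helper names, not part of either library,
and the expected values assume POSIX-style paths.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: copy each element of a path into a vector of
// strings, giving a std::vector<std::string>-like view of the path.
std::vector<std::string> components(const fs::path& p)
{
    std::vector<std::string> out;
    for (const fs::path& element : p)
        out.push_back(element.generic_string());
    return out;
}

// Hypothetical inverse: rebuild a path from its elements with operator/=.
fs::path from_components(const std::vector<std::string>& parts)
{
    fs::path result;
    for (const std::string& part : parts)
        result /= part;
    return result;
}
```

With this, components("/foo/bar/baz.txt") yields "/", "foo", "bar" and
"baz.txt" on a POSIX system, and the usual container mutators (pop_back,
erase and so on) become available on the vector before converting back.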

> > In particular, this comes up because I'm trying to find the greatest
> > common prefix of two paths. I thought this would be easy; I'd just use
> > std::mismatch. But even once I've found the mismatch I don't see any
> > obvious way to chop off the non-matching parts of one of the paths. I
> > end up having to resort to some really ugly code (or I just haven't
> > figured out how to use this thing correctly).
>
>
I wonder if this is *really* what you want! I suspect that you probably
want to determine the common effective prefix of the paths after
canonicalisation.

For illustration:
I suspect that the result you want from fn("/usr/sbin/../bin/test1.txt",
"/usr/bin/test2.txt") is "/usr/bin" rather than "/usr".

The inclusion or exclusion of links is less obvious. My experience is that,
for the most part, I simply want the absolute canonical representation to be
considered.
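
A minimal sketch of the behaviour I have in mind follows. It assumes
std::filesystem rather than boost::filesystem so that it is self-contained,
and it normalises lexically (collapsing "." and "..") instead of calling
canonical(), so links are not resolved and the files need not exist.
common_prefix() is a hypothetical name.

```cpp
#include <algorithm>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: longest common leading run of path elements,
// computed after purely lexical normalisation. No file-system access
// takes place and symbolic links are not resolved.
fs::path common_prefix(const fs::path& a, const fs::path& b)
{
    const fs::path na = a.lexically_normal();
    const fs::path nb = b.lexically_normal();

    // The four-iterator overload stops at the end of the shorter path.
    const auto diff = std::mismatch(na.begin(), na.end(), nb.begin(), nb.end());

    fs::path prefix;
    for (auto it = na.begin(); it != diff.first; ++it)
        prefix /= *it;
    return prefix;
}
```

For the two paths in the illustration above this returns "/usr/bin"; without
the normalisation step the same loop would stop at "/usr".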

> Not particularly elegant, but this does work:
>
> path x("/foo/bar");
> path y("/foo/baar");
>
> auto result = std::mismatch(x.begin(), x.end(), y.begin());
>
> path prefix;
> for (auto itr = x.begin(); itr != result.first; ++itr)
>   prefix /= *itr;
>
> std::cout << prefix << std::endl;
>
>
I think this code doesn't really "work", precisely because it meets only the
stated requirements! The real requirements are normally greater than those we
first think of when looking at the problem.

> > Why should paths be so different from everything else? I think, if the
> > design is actually right, some rationale is sorely needed.
> >
>
> Also,
> >
> > * (**) the docs don't say what the value_type of path::iterator is. A
> > string value? A range that becomes invalid when the path is
> > destroyed? Ah!?! How surprising; inspecting the code shows it
> > iterates over paths! A container whose element type is itself is very
> > unusual!
>
> It is a kludge to deal with the type of the contained string being
> implementation defined and not necessarily the type the user wants. In
> other words, a misuse of path to supply string interoperability. The
> returned type should ideally be a basic_string, with begin() and end()
> templatized on the string details, but I didn't think of that until
> recently.
>
> > * the docs claim you can construct a path from a "A C-array. The value
> > type is required to be char, wchar_t, char16_t, or char32_t", but
> > doesn't say how that array will be interpreted. From the wording I
> > might have assumed it accepts a CharT(&)[N] and the length of the
> > input is taken as N, but inspecting the code shows it expects a CharT*
> > and interprets the source as null-terminated.
>
> I'll make some doc changes per your comments above.
>
>
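The two observations above (iteration yields paths, and a character array is
read as a null-terminated string) can be checked with a few lines. This is
only a sketch against std::filesystem, which I am assuming behaves like
boost::filesystem on these points; from_array_with_embedded_null() is a
hypothetical name.

```cpp
#include <filesystem>
#include <iterator>
#include <string>
#include <type_traits>

namespace fs = std::filesystem;

// Each element yielded by iterating over a path is itself a path.
static_assert(std::is_same_v<
    std::iterator_traits<fs::path::iterator>::value_type, fs::path>);

// Hypothetical helper: construct a path from a char array that contains an
// embedded null. The array is treated as a null-terminated string, so the
// array length N is ignored and everything after the first null is dropped.
std::string from_array_with_embedded_null()
{
    const char buffer[] = "foo\0bar";
    return fs::path(buffer).string();
}
```

Here the helper returns "foo", not the full contents of the array.
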
The addition of a make_relative_path function has been discussed, and code
to provide the relative path from the canonical formats has been submitted
previously; see:
http://stackoverflow.com/questions/10167382/boostfilesystem-get-relative-path

It looks to be a very valuable feature, even if the implementation
requires adjustment.
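
As a sketch of the sort of function meant here (not the code from the link
above): the version below assumes std::filesystem and its
lexically_relative() member, and works purely lexically, so it does not
consult the file system or resolve links. The name make_relative_path is
taken from the discussion; the signature is my assumption.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: express `target` relative to `base`. Both inputs are
// normalised lexically first. An empty path is returned when no relative
// path can be formed, for example when only one of the two is absolute.
fs::path make_relative_path(const fs::path& target, const fs::path& base)
{
    return target.lexically_normal().lexically_relative(base.lexically_normal());
}
```

For example, make_relative_path("/a/b/c", "/a/d") gives "../b/c".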

> --Beman
>
>
With all that stated, I have found the recent versions of Boost.Filesystem
to support my use-cases elegantly and without issue. Indeed, it frequently
offers superior solutions that are much better considered and thought
through than those provided by many scripting languages. Obviously, while
most of my communication has been about what I would like to see done
differently, I am a grateful user of this library. Thank you for your hard
work.

Regards,
Neil Groves

