From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2021-10-16 18:53:45


On Sat, Oct 16, 2021 at 11:10 AM Phil Endecott via Boost
<boost_at_[hidden]> wrote:
> I would be very interested to see a comparison of how your
> decomposition of paths into segments, and the reverse, compares
> to what std::filesystem::path does, and rationale for the
> differences.

To be honest I have no idea; I have never used std::filesystem to any
meaningful extent.

> > abs("/././/", { ".", "", "" });
>
> Well that's an interesting one. Why is the middle element "" not
> "."? Can you clarify?

So yeah, there are some basic rules. For example, if you call

    u.encoded_segments() = { ... }; // initializer_list<string_view>

Then the result of

    for( auto it : u.encoded_segments() )
        ...

should yield the same list of strings used in the initializer list. If
we accept that a path of "" is an empty relative path, and a path of
"/" is an empty absolute path, then iterating either of those paths
should produce the empty set: {}

A goal of the library is to guarantee that all mutations leave the URL
in a syntactically valid state. Consider the following sequence of
operations:

    url u = parse_uri( "https://example.com//index.htm" ).value();

    u.remove_authority();

What should the resulting encoded URL be? Well, a naive algorithm might
leave us with "https://index.htm", but this is clearly wrong, as
index.htm is now the host! To fix this, the library prepends "/." to
the path (this is guidance from RFC 3986):

    assert( u.encoded_url() == "https:/.//index.htm" );
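
As a rough sketch of that adjustment, here is a standalone function
that rebuilds the URL string once the authority is gone. This is not
the library's implementation; the name without_authority and its
(scheme, path) signature are made up for illustration, and it ignores
query and fragment.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, not part of any library: builds "scheme:path"
// for a URL whose authority has been removed.
std::string without_authority(std::string_view scheme, std::string_view path)
{
    std::string out(scheme);
    out += ':';
    // Without an authority, a path starting with "//" would be parsed
    // as an authority again, so shield it with a leading "/.".
    if (path.substr(0, 2) == "//")
        out += "/.";
    out += path;
    return out;
}
```

With the path "//index.htm" from the example above, this produces
"https:/.//index.htm"; a path such as "/index.htm" is left alone.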

There are a handful of other cases like this (for example, removing
the scheme from a relative-ref whose first segment has a colon).
Coming back to:

    abs("/././/", { ".", "", "" });

We treat a leading "/." as not being part of the segments. That keeps
the library's syntactic adjustments transparent and satisfies the rule
that assigning a list of segments produces the same list when
iterated. So the first "/." is skipped, and the rest of the path,
"/.//", iterates as { ".", "", "" }.
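
To make the rule concrete, here is a minimal standalone sketch of the
decomposition. It is not the library's code: the name split_segments
is invented, and it assumes that the leading "/." is skipped only
when another "/" follows it, which is my reading of the examples
above rather than something stated outright.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: splits an encoded path into its segments.
std::vector<std::string> split_segments(std::string_view path)
{
    // Assumption: a leading "/." followed by "/" is a syntactic
    // prefix added by the library, not a segment.
    if (path.substr(0, 3) == "/./")
        path.remove_prefix(2);

    // "" (empty relative) and "/" (empty absolute) have no segments.
    if (path.empty() || path == "/")
        return {};

    // The leading "/" of an absolute path only marks it as absolute.
    if (path.front() == '/')
        path.remove_prefix(1);

    std::vector<std::string> out;
    for (;;)
    {
        auto const pos = path.find('/');
        out.emplace_back(path.substr(0, pos));
        if (pos == std::string_view::npos)
            break;
        path.remove_prefix(pos + 1);
    }
    return out;
}
```

Under those assumptions "/././/" gives { ".", "", "" }, and
"/.//index.htm" gives { "", "index.htm" }, the same segments as the
original "//index.htm" before the authority was removed.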

> I might perhaps hope that something like this would work:
>
> for (auto dir: path.segments) {
> chdir(dir.c_str());
> }

I believe it will work for the cases where it should work, although I
can't say with certainty until I have finished making these changes.

> I believe that std::filesystem::path merges adjacent directory separators,
> so the "" segments disappear.

Yep, that needs to be a supported operation, but it should not be
called push_back or insert, because operations that share a name with
a vector member should behave exactly like the vector version. And
vector<string> doesn't delete a trailing "" when you push_back a
non-empty string. I would have to add more functions such as
(bikeshedding aside) merge_back or merge_at.
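
To illustrate the difference on a plain vector<string>, here is one
possible meaning for merge_back. The function does not exist yet, so
the semantics below are only a guess at what "merging" could mean: a
trailing "" is overwritten instead of being kept.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical free function sketching one possible merge_back.
void merge_back(std::vector<std::string>& segs, std::string s)
{
    if (!segs.empty() && segs.back().empty())
    {
        // A trailing "" means the path ends in a separator; reuse
        // that slot so no empty segment is left in the middle.
        segs.back() = std::move(s);
        return;
    }
    // Otherwise behave like an ordinary push_back.
    segs.push_back(std::move(s));
}
```

Starting from { "a", "" }, push_back("b") leaves { "a", "", "b" },
while this merge_back yields { "a", "b" }.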

Thanks

