From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2021-10-18 02:15:11


On Sun, Oct 17, 2021 at 6:05 PM Gavin Lambert via Boost
<boost_at_[hidden]> wrote:
> So what about the input "https://example.com/./index.htm"? Unless
> you're documenting automatic normalisation, this should still iterate
> the "." and "index.htm" path components separately.

Well, I wouldn't say that the library is "doing automatic
normalization." It is just special-case treatment of up to 2 leading
characters of the path. As a general rule I think the library needs to
make a best effort to preserve the segments that path iteration
produces. That means, if you have a URL with index.htm and for
whatever reason the library needs to put the prefix "./" in front of
it, then iterating the path should still only give index.htm.
Furthermore, the implementation should not hide any "state"
information outside of the string itself.

If you have "//x//y" (segments { "", "y" }) and you remove the
authority to get ".//y", then add the authority back to get
"//x/.//y", and iterating then produces { ".", "", "y" }, we have
changed what iterating the segments returns without the user asking
for it.
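
To make the problem concrete, here is a minimal sketch of a naive
segment splitter. The name naive_segments is hypothetical and is not
part of the library; it only assumes that a leading "/" marks an
absolute path and that every other "/" separates two segments.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper for illustration only, not a library API.
// Splits a URL path on "/" with no special handling of any prefix.
std::vector<std::string> naive_segments(std::string_view path)
{
    std::vector<std::string> out;
    if (!path.empty() && path.front() == '/')
        path.remove_prefix(1);      // a leading "/" only marks an absolute path
    if (path.empty())
        return out;                 // "" and "/" have no segments
    for (;;)
    {
        auto const pos = path.find('/');
        if (pos == std::string_view::npos)
        {
            out.emplace_back(path); // last segment
            return out;
        }
        out.emplace_back(path.substr(0, pos));
        path.remove_prefix(pos + 1);
    }
}
```

With this splitter the path "//y" gives { "", "y" }, while the path
"/.//y" left behind by the round trip gives { ".", "", "y" }, so the
prefix leaks into the iterated segments.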

So I think the library has to treat up to 2 leading characters of the
path as special ("/", "./", and "/."), adding and removing them as
needed to preserve syntactic correctness and semantic equivalence when
performing mutation operations. To ensure the stability of iterated
segments, this means that iteration shouldn't return that malleable
prefix of the path.
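
A rough sketch of what skipping the prefix could look like, under my
own assumptions rather than as the library's implementation: the names
stable_segments, split_all and has_prefix are hypothetical, a leading
"/." is only treated as a prefix when another "/" follows it, and what
remains after a leading "./" is read as a relative path.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helpers for illustration only, not a library API.

// Split on every "/"; an empty input yields no segments.
std::vector<std::string> split_all(std::string_view s)
{
    std::vector<std::string> out;
    if (s.empty())
        return out;
    for (;;)
    {
        auto const pos = s.find('/');
        if (pos == std::string_view::npos)
        {
            out.emplace_back(s);
            return out;
        }
        out.emplace_back(s.substr(0, pos));
        s.remove_prefix(pos + 1);
    }
}

bool has_prefix(std::string_view s, std::string_view prefix)
{
    return s.substr(0, prefix.size()) == prefix;
}

// Assumed rule: the prefixes "/", "./" and "/." carry no segment of
// their own, so they are dropped before splitting.
std::vector<std::string> stable_segments(std::string_view path)
{
    if (has_prefix(path, "./"))
    {
        path.remove_prefix(2);      // drop "./"; the rest is a relative path
        return split_all(path);
    }
    if (has_prefix(path, "/./"))
        path.remove_prefix(2);      // drop "/."; the rest still starts with "/"
    if (!path.empty() && path.front() == '/')
        path.remove_prefix(1);      // drop the "/" that marks an absolute path
    return split_all(path);
}
```

Under these assumptions "//y", ".//y" and "/.//y" all iterate as
{ "", "y" }, and "index.htm", "./index.htm" and "/./index.htm" all
iterate as { "index.htm" }. Note that this also skips a "./" the user
wrote deliberately, which is the trade-off being discussed in this
thread.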

Thanks

