From: Gavin Lambert (boost_at_[hidden])
Date: 2021-10-18 01:05:28


On 18/10/2021 13:01, Vinnie Falco wrote:
>>> assert( u.encoded_url() == "https:/.//index.htm" );
>>
>> I assume this was intended to be "https://./index.htm"?
>
> Nope, it was correct as I wrote it. You managed to produce an
> authority with a single dot :)

Yes, that was my intent, from your description of replacing the
authority with a dot.

Though I see the issue now. I was reading the input as:

     "https://example.com/index.htm"

For which removing the authority should result in:

     "https:/index.htm"

(Although this is unusual; typically relative URIs will omit the scheme
as well.)

For the input:

     "https://example.com//index.htm"

Then it does make sense at a purely-URL-level to transform this to:

     "https:/.//index.htm"

(Most web servers would treat either as illegal, but you could
envisage some non-HTTP protocol that requires such syntax.)
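
To make the two cases above concrete, here is a minimal sketch of the
rule as I understand it. The helper name without_authority is
hypothetical and is not Boost.URL's API; it only shows why a path that
begins with "//" needs a "/." shield once the authority is gone.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (an assumption for illustration, not a library call):
// rebuild a URL from a scheme and a path, with no authority component.
std::string without_authority(const std::string& scheme,
                              const std::string& path)
{
    std::string out = scheme + ":";
    // A path that starts with "//" would be re-parsed as introducing an
    // authority, so it is prefixed with "/." to keep it a path.
    if (path.compare(0, 2, "//") == 0)
        out += "/.";
    out += path;
    return out;
}
```

With this sketch, the path "/index.htm" yields "https:/index.htm" and
the path "//index.htm" yields "https:/.//index.htm".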

Adding the authority back to this URL should, however, result in
"https://example.com/.//index.htm" -- the library should not be
"ignoring" the "/." prefix once it exists. At least not until the URL
is normalised. (Unless you're documenting that URLs are always stored
in normalised form, or that setters will automatically normalise.)
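
A rough sketch of the distinction between re-adding the authority and
normalising, assuming plain string handling rather than any real
library: with_authority is a hypothetical helper that only
concatenates, and remove_dot_segments is my attempt at the algorithm
from RFC 3986 section 5.2.4. Only the second step drops the "/.".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: join the parts verbatim, with no normalisation.
std::string with_authority(const std::string& scheme,
                           const std::string& authority,
                           const std::string& path)
{
    return scheme + "://" + authority + path;
}

// Sketch of RFC 3986 section 5.2.4 ("Remove Dot Segments") on a path.
std::string remove_dot_segments(std::string in)
{
    std::string out;
    auto starts = [&in](const char* p) { return in.rfind(p, 0) == 0; };
    auto pop_last = [&out]() {
        // Drop the last segment of the output, including its leading "/".
        std::size_t pos = out.rfind('/');
        out.erase(pos == std::string::npos ? 0 : pos);
    };
    while (!in.empty()) {
        if (starts("../"))            in.erase(0, 3);
        else if (starts("./"))        in.erase(0, 2);
        else if (starts("/./"))       in.erase(0, 2);   // "/./" becomes "/"
        else if (in == "/.")          in = "/";
        else if (starts("/../"))      { in.erase(0, 3); pop_last(); }
        else if (in == "/..")         { in = "/"; pop_last(); }
        else if (in == "." || in == "..") in.clear();
        else {
            // Move the first segment (with its leading "/") to the output.
            std::size_t next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}
```

Here with_authority("https", "example.com", "/.//index.htm") keeps the
prefix, while remove_dot_segments("/.//index.htm") gives "//index.htm".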

> We treat a leading "/." as not appearing in the segments, to make the
> behavior of the library doing these syntactic adjustments transparent
> and satisfy the rule that assignments from segments produce the same
> result when iterated.

So what about the input "https://example.com/./index.htm"? Unless
you're documenting automatic normalisation, this should still iterate
the "." and "index.htm" path components separately.
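
For illustration, this is the un-normalised iteration I would expect,
sketched with a hypothetical raw_segments function over a plain path
string (not the library's segments container):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a path on "/" without any normalisation,
// so "." and empty segments are reported exactly as written.
std::vector<std::string> raw_segments(const std::string& path)
{
    std::vector<std::string> segs;
    // Skip a single leading "/" that marks an absolute path.
    std::size_t pos = (!path.empty() && path[0] == '/') ? 1 : 0;
    if (pos >= path.size())
        return segs;  // "" and "/" have no segments in this sketch
    for (;;) {
        std::size_t next = path.find('/', pos);
        if (next == std::string::npos) {
            segs.push_back(path.substr(pos));
            break;
        }
        segs.push_back(path.substr(pos, next - pos));
        pos = next + 1;
    }
    return segs;
}
```

Under these assumptions "/./index.htm" iterates as "." then
"index.htm", and "/.//index.htm" iterates as ".", an empty segment,
then "index.htm".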