From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2021-10-13 01:06:02
On Tue, Oct 12, 2021 at 2:52 PM Alex Christensen <achristensen_at_[hidden]> wrote:
> It is perfectly valid input that some URL libraries I work with accept and
> percent encode, and some URL libraries I work with reject it as an invalid
> URL. I think it's a valid URL parser input that ought to produce a valid URL,
> but not everyone agrees on this yet.
Not so fast, I think that this can be decided objectively.
A URL in the context of Boost.URL refers to "URI" in the rfc3986
sense. I use URL because most people have never heard of URI.
What you are thinking of as a "valid URL parser input" is actually an
Internationalized Resource Identifier, which supports the broader
universal character set instead of just ASCII and is abbreviated by
the even more obscure acronym "IRI." It is covered by rfc3987:
<https://datatracker.ietf.org/doc/html/rfc3987>
Translating your comment, I think you're saying "Boost.URL should
support Internationalized Resource Identifiers." That is unfortunately
out of scope for the library, as Boost.URL is mostly designed for the
exchange of URLs between machines or programs and not necessarily for
display to users. Perhaps someday, the entire world will have switched
to IRIs (maybe after IPv4 is no longer in use) but we are not there
yet, and most systems require IRIs to be mapped to their URI
equivalent:
<https://datatracker.ietf.org/doc/html/rfc3987#section-3>
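To make the mapping concrete, here is a minimal sketch of its core step: each non-ASCII character, taken as UTF-8 octets, is percent-encoded, and ASCII passes through unchanged. The function name iri_to_uri_sketch is hypothetical and not part of Boost.URL. The sketch assumes the input is already valid UTF-8, and it does not attempt the separate host-name (IDNA) handling that rfc3987 also describes, so treat it as an illustration rather than a conforming implementation.

```cpp
#include <cassert>      // used only by the usage check shown after this block
#include <string>
#include <string_view>

// Hypothetical helper, not part of Boost.URL.
// Assumes `iri_utf8` is valid UTF-8. Every octet >= 0x80 (that is, every
// octet of a non-ASCII character) becomes %HH; ASCII octets are copied as-is.
std::string iri_to_uri_sketch(std::string_view iri_utf8)
{
    static constexpr char hex[] = "0123456789ABCDEF";

    std::string out;
    out.reserve(iri_utf8.size());

    for (unsigned char c : iri_utf8)
    {
        if (c < 0x80)
        {
            // Plain ASCII: leave untouched, including existing %-escapes.
            out.push_back(static_cast<char>(c));
        }
        else
        {
            // Non-ASCII octet: emit as an uppercase percent-encoded triplet.
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}
```

For example, "é" (U+00E9) is the two UTF-8 octets C3 A9, so assert(iri_to_uri_sketch("http://example.com/" "\xC3\xA9") == "http://example.com/%C3%A9") holds. The result is plain ASCII, which is the kind of input a URI parser in the rfc3986 sense expects.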
There is some value to IRIs but not as much as there is for the ASCII
URLs, which fill a tremendous user need (HTTP/WebSocket clients and
servers using Beast or Asio).
Thanks