From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2021-10-12 18:17:18


On Tue, Oct 12, 2021 at 10:57 AM Alex Christensen
<achristensen_at_[hidden]> wrote:
> Why did you use RFC 3986 as your specification?

Well, this one seems to be the latest RFC regarding URLs, modulo a bit
of HTTP-specific stuff like authority-form appearing in RFC 7230. Is
there something newer?

> How do you feel about the WhatWG URL specification at https://url.spec.whatwg.org?

Quite frankly, I hate it. This "specification" manages to organize and
present the information in the most obtuse manner possible. I find it
hostile to implementers like myself.

> It has a large body of tests ...that you may consider looking at.

Yep, it does! Integrating them is on my to-do list:

<https://github.com/CPPAlliance/url/tree/c6c4b433c3b1057161b6ce50bb4fba0b5f59b4ee/test/wpt>

> Do you have any general plan for a strategy for handling non-ASCII input?

Yes, the plan is to reject such input. Strings containing non-ASCII
characters are not valid URLs. And even some ASCII characters are not
allowed to appear in a URL, for example all control characters.
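
As a rough illustration of that rule, here is a minimal sketch of a byte-level pre-check. The function name has_only_permitted_bytes is made up for this example and is not part of the library; it covers only the two cases mentioned above (non-ASCII bytes and ASCII control characters), not the full RFC 3986 grammar.

```cpp
#include <string_view>

// Hypothetical helper, not a library function: returns true only if every
// byte is 7-bit ASCII and is not a control character (0x00-0x1F or 0x7F).
// A string that passes may still be an invalid URL; this is only a pre-check.
bool has_only_permitted_bytes(std::string_view s)
{
    for (unsigned char c : s)
    {
        if (c >= 0x80)
            return false; // non-ASCII byte, e.g. part of a raw UTF-8 sequence
        if (c < 0x20 || c == 0x7F)
            return false; // ASCII control character
    }
    return true;
}
```

Under these assumptions, "http://example.com/path" passes, while the same string with a raw UTF-8 "é" or an embedded newline is rejected.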

> I haven’t tested it yet, but what do you plan to do if someone passes a UTF-8
> encoded non-ASCII string...

I think what you're asking is, what if someone supplies a URL which
has escaped characters which, when percent-decoding is applied, become
valid UTF-8 code point sequences? That's perfectly fine.
Percent-encoded URL parts are in fact "ASCII strings."
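
To make that concrete, a small sketch of percent-decoding follows. The names hex_value and percent_decode are hypothetical and written only for this example; they are not the library's interface. The input "caf%C3%A9" contains nothing but ASCII characters, yet decoding it yields the UTF-8 bytes for "café".

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: value of a hexadecimal digit, or -1 if c is not one.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: replaces each %XX escape with the byte it denotes.
// Returns std::nullopt if an escape is truncated or has a non-hex digit.
std::optional<std::string> percent_decode(std::string_view s)
{
    std::string out;
    out.reserve(s.size());
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (s[i] != '%')
        {
            out.push_back(s[i]);
            continue;
        }
        if (s.size() - i < 3)
            return std::nullopt; // '%' without two following characters
        int const hi = hex_value(s[i + 1]);
        int const lo = hex_value(s[i + 2]);
        if (hi < 0 || lo < 0)
            return std::nullopt; // not a hexadecimal escape
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2; // skip the two hex digits just consumed
    }
    return out;
}
```

The decoder does not check that the resulting bytes form valid UTF-8; that would be a separate step.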

> ...into the constructor?

You can't construct a URL from a string, you have to go through one of
the parsing functions. This is because the library recognizes several
variations of URL grammar, and does not favor any particular grammar
by choosing one to support construction. See Table 1.1:

<https://master.url.cpp.al/url/parsing.html>
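
The design idea can be sketched with a toy type. Everything below (toy_url, parse_with_scheme, parse_relative) is hypothetical and heavily simplified; it is not the library's API and checks only a tiny fragment of the RFC 3986 rules. The point is that the constructor is private, so callers must name the grammar they want.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Toy stand-in for a URL type. It cannot be built directly from a string.
class toy_url
{
    std::string text_;
    explicit toy_url(std::string_view s) : text_(s) {}

    friend std::optional<toy_url> parse_with_scheme(std::string_view);
    friend std::optional<toy_url> parse_relative(std::string_view);

public:
    std::string const& str() const { return text_; }
};

// True if s begins with ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":".
bool starts_with_scheme(std::string_view s)
{
    if (s.empty() || !std::isalpha(static_cast<unsigned char>(s[0])))
        return false;
    for (std::size_t i = 1; i < s.size(); ++i)
    {
        unsigned char const c = static_cast<unsigned char>(s[i]);
        if (c == ':')
            return true;
        if (!std::isalnum(c) && c != '+' && c != '-' && c != '.')
            return false;
    }
    return false;
}

// Grammar 1 (simplified): the input must start with a scheme.
std::optional<toy_url> parse_with_scheme(std::string_view s)
{
    if (!starts_with_scheme(s))
        return std::nullopt;
    return toy_url(s);
}

// Grammar 2 (simplified): a relative reference, whose first path segment
// must not contain ':' so it cannot be mistaken for a scheme.
std::optional<toy_url> parse_relative(std::string_view s)
{
    std::size_t const pos = s.find_first_of(":/?#");
    if (pos != std::string_view::npos && s[pos] == ':')
        return std::nullopt;
    return toy_url(s);
}
```

With this sketch, "http://example.com/" is accepted by parse_with_scheme and rejected by parse_relative, while "/path/only" is the other way around. A single string constructor would have had to pick one of those answers.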

Thanks

