

From: Rainer Deyke (rdeyke_at_[hidden])
Date: 2022-08-21 17:55:24


On 21.08.22 17:13, Vinnie Falco via Boost wrote:
> On Sun, Aug 21, 2022 at 7:15 AM Rainer Deyke via Boost
> <boost_at_[hidden]> wrote:
>> - The lack of IRI support is unfortunate. It's 2022; we should all
>> be writing software with Unicode support by default. However, this can
>> be built on top of Boost.URL, and isn't needed in all cases.
>
> We will probably add something to parse an IRI but in all likelihood
> it would convert it to UTF-8 as a regular URL. I don't know if I have
> the stomach for a total duplication of the existing library except
> names like iri_view, iri, static_iri, the duplication of all the
> segments and params containers, and the modification of all those
> mutating algorithms to support Unicode. I'm not even sure that it is
> called for, given that IRIs are for more user-facing purposes. Such
> things cannot be submitted in HTTP requests, and as far as I know,
> unicode host names would need to be converted to punycode anyway
> (which we could do) to submit them to a DNS server.

I see IRIs not as a different datatype, but as a specific interpretation
of URIs. The transparent percent en-/decoding of Boost.URL already gets
us most of the way there. Additional IRI support would mean:
   - Decoding accessors that perform UTF-8 validation. (Arbitrary
percent-encoded 8-bit values are legal in URLs, but not in IRIs.)
   - Additional mutators that perform NFC normalization or validation.
   - Punycode encoding/decoding.
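
To make the first item concrete, here is a rough sketch of what a
validating decoding accessor could look like. It is standalone C++17 and
does not use Boost.URL; the names percent_decode, is_valid_utf8 and
decode_as_iri are hypothetical and only serve as illustration. The
validation follows the usual UTF-8 rules (no overlong forms, no
surrogates, nothing above U+10FFFF).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: numeric value of a hex digit, or -1 if not one.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: plain URL-style percent-decoding.
// Any octet is accepted; only malformed escapes are rejected.
inline std::optional<std::string> percent_decode(std::string_view s) {
    std::string out;
    out.reserve(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] != '%') { out.push_back(s[i]); continue; }
        if (s.size() - i < 3) return std::nullopt;  // truncated escape
        int hi = hex_value(s[i + 1]);
        int lo = hex_value(s[i + 2]);
        if (hi < 0 || lo < 0) return std::nullopt;  // non-hex digits
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;
    }
    return out;
}

// Strict UTF-8 check: rejects stray continuation bytes, truncated
// sequences, overlong encodings, surrogates and values above U+10FFFF.
inline bool is_valid_utf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) { ++i; continue; }  // ASCII
        std::size_t len;
        unsigned long cp, min_cp;
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min_cp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min_cp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min_cp = 0x10000; }
        else return false;  // invalid lead byte
        if (s.size() - i < len) return false;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp || cp > 0x10FFFF) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        i += len;
    }
    return true;
}

// Hypothetical IRI-flavoured accessor: decode, then insist on UTF-8.
inline std::optional<std::string> decode_as_iri(std::string_view encoded) {
    auto decoded = percent_decode(encoded);
    if (!decoded || !is_valid_utf8(*decoded)) return std::nullopt;
    return decoded;
}

// Usage: "%FF" is a legal URL escape but not valid in an IRI.
inline void iri_decode_examples() {
    assert(percent_decode("%FF").has_value());
    assert(!decode_as_iri("%FF").has_value());
    assert(decode_as_iri("caf%C3%A9").has_value());
}
```

The point of the split is that the URL-level decoder stays permissive
while the IRI-level accessor is just a thin check layered on top of it.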

-- 
Rainer Deyke (rainerd_at_[hidden])
