From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2020-01-22 08:59:25


On 2020-01-22 04:10, Gavin Lambert via Boost wrote:
> On 22/01/2020 07:39, Andrey Semashev wrote:
>> On 2020-01-21 18:51, Vinnie Falco wrote:
>>> On Tue, Jan 21, 2020 at 2:13 AM Andrey Semashev wrote:
>>>> I'd be more interested in a more generic URI library.
>>>> Along with a few associated algorithms, e.g. those described in:
>>>> https://tools.ietf.org/html/rfc3986
>>>
>>> Yes, this library does that. I do not use the term "URI" because it is
>>> confusing and pointless. They are all URLs now. My library follows the
>>> RFC, except that I have renamed the top level production rules to
>>> reflect this preference:
>>>
>>>     URL           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
>>>     URL-reference = URL / relative-ref
>>>     absolute-URL  = scheme ":" hier-part [ "?" query ]
>>>
>>> I didn't invent this idea, deprecating the word "URI" and using "URL"
>>> consistently in its place is recommended by WhatWG.
>>
>> There is a semantic difference between URI and URL - the former is an
>> identifier and the latter is a locator (i.e. a path to a resource
>> location). You can treat locator as an identifier but not the other
>> way around. Using the term URL to refer to a URI is confusing.
>
> Notably, all URLs are URIs, but not all URIs are URLs.  Some are URNs,
> for example, which are structured a bit differently (eg.
> "urn:oasis:names:specification:docbook:dtd:xml:4.1.2").
>
> A program only dealing with "locations to download from" generally only
> needs to worry about URLs, but there are other places where all URIs
> (including URNs) may be encountered (even by such a program) -- for
> example, as XML namespace identifiers.  (Usually these can be treated as
> opaque, though.)
>
> Still, given that the same parsing rules can apply to both (URNs usually
> just have a long opaque path after the "urn" scheme), it doesn't seem
> unreasonable to call it an "URL library" anyway (despite the
> recommendation in RFC3986).  Some people would be confused by calling
> them "URIs" and those who know better will know that as well.  Having
> said that, the docs should call out RFC support and URI compatibility
> explicitly, so that people aren't left wondering.

From https://tools.ietf.org/html/rfc8141:

    A Uniform Resource Name (URN) is a Uniform Resource Identifier (URI)
    that is assigned under the "urn" URI scheme and a particular URN
    namespace, with the intent that the URN will be a persistent,
    location-independent resource identifier.

So the name URI is very much appropriate when working with URNs, as it
is with URLs. But URL is definitely not an appropriate term for URNs.

"People will understand what you mean" is not the right reasoning. As a
programmer, you have every opportunity to pick the right name for an
entity in your code, so that a technically educated reader understands
what that entity represents. People who aren't programmers, or who do
not know even the basic terms of your technical domain, are not your
audience. Personally, I wouldn't use a `url` type to represent URIs, for
documentation purposes alone.
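
To illustrate the naming point, here is a minimal sketch of how a URN
fits the generic RFC 3986 syntax: it has a scheme, but the rest is an
opaque name rather than a location. All names below (`uri_parts`,
`split_scheme`, `is_urn`) are hypothetical and are not taken from the
library under discussion. The sketch only splits off the scheme and is
not a full RFC 3986 parser.

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical illustration only; not the API of any proposed Boost library.
struct uri_parts {
    std::string scheme;     // lower-cased, since schemes are case-insensitive
    std::string_view rest;  // everything after the first ':'; views the input
};

// Splits "scheme:rest" following RFC 3986, section 3.1:
//   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
// Returns std::nullopt when the input does not start with a valid scheme.
// Assumes the default "C" locale for the <cctype> classification functions.
inline std::optional<uri_parts> split_scheme(std::string_view s) {
    const auto colon = s.find(':');
    if (colon == std::string_view::npos || colon == 0) return std::nullopt;
    if (!std::isalpha(static_cast<unsigned char>(s[0]))) return std::nullopt;

    uri_parts out;
    for (char c : s.substr(0, colon)) {
        const auto u = static_cast<unsigned char>(c);
        if (!std::isalnum(u) && c != '+' && c != '-' && c != '.')
            return std::nullopt;
        out.scheme.push_back(static_cast<char>(std::tolower(u)));
    }
    out.rest = s.substr(colon + 1);
    return out;
}

// Per RFC 8141, a URN is a URI assigned under the "urn" scheme.
inline bool is_urn(std::string_view s) {
    const auto parts = split_scheme(s);
    return parts && parts->scheme == "urn";
}
```

With this, `is_urn("urn:oasis:names:specification:docbook:dtd:xml:4.1.2")`
is true and `is_urn("https://www.boost.org/")` is false, yet both inputs
pass through the same `split_scheme`. A type holding either one is a
URI; calling it `url` would misdescribe the first case.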

