From: Andrzej Krzemienski (akrzemi1_at_[hidden])
Date: 2022-06-04 21:22:33


Hi Everyone,
I am trying to do a review of the Boost.URL library (I usually find the
official ten-day review period to be too short), and I came across a number
of interesting things that I thought I would mention.

The docs say:

> The library requires Boost and a compiler supporting at least C++11.
>

Even though the library is a candidate for a Boost library, I understand
that it is now offered as a "stand alone" version, and this is what the
docs describe. I tried to use it with the latest MinGW Distro on Windows (
https://nuwen.net/mingw.html), which uses GCC 11.2 and Boost 1.77. Without
success. This is because Boost.URL relies on the component
boost::system::result<T>, which is present in Boost.System only since
version 1.78:
https://www.boost.org/doc/libs/1_79_0/libs/system/doc/html/system.html#changes_in_boost_1_78

First, it is news to me that we have `result<T>` in Boost.System, which
overlaps with result<T, E> from Boost.Outcome. Second, I recommend
that the Boost.URL docs say that the library requires Boost 1.78 or higher.
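
To illustrate the version requirement, here is a minimal sketch of such a
check. It assumes the usual BOOST_VERSION encoding (major * 100000 +
minor * 100 + patch). The helper name `has_system_result` is hypothetical and
is not part of any Boost library; in real code the argument would be the
BOOST_VERSION macro from <boost/version.hpp>.

```cpp
#include <cassert>

// Hypothetical helper (not a Boost API): true when a BOOST_VERSION-style
// number denotes Boost 1.78.0 or later, the first release that ships
// boost::system::result<T> according to the Boost.System change log.
// Encoding assumed: major * 100000 + minor * 100 + patch.
constexpr bool has_system_result(int boost_version)
{
    return boost_version >= 107800;
}

// Boost 1.77 (the version in the MinGW distro) fails the check;
// Boost 1.78 and 1.79 pass it.
static_assert(!has_system_result(107700), "1.77 lacks result<T>");
static_assert(has_system_result(107800), "1.78 provides result<T>");
static_assert(has_system_result(107900), "1.79 provides result<T>");
```

In a library header the same comparison could be written as a preprocessor
guard (`#if BOOST_VERSION < 107800` followed by `#error`), so that users get
a readable diagnostic instead of a missing-type error.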

Next, we read:

> Aliases for standard types, such as string_view
> <https://master.url.cpp.al/url/ref/boost__urls__string_view.html>, use
> their Boost equivalents.
>

After reading this, I expected that Boost.URL would use boost::string_view
from Boost.Utility library:
https://www.boost.org/doc/libs/1_79_0/libs/utility/doc/html/utility/utilities/string_view.html

But instead, it uses boost::core::string_view, which is an implementation
detail from Boost.Core library:
https://github.com/CPPAlliance/url/blob/master/include/boost/url/string_view.hpp

Again, it is news to me that Boost has two implementations of
string_view. Why? Second, I do not think that Boost.URL should rely on the
implementation details of Boost.Core. A better alternative would be to use
the official boost::string_view from Boost.Utility. Or is there a good
reason not to?

Next, the section on the parsers (
https://master.url.cpp.al/url/parsing/url.html) describes the function
parse_uri() which returns result<url_view>. What strikes me is this
difference: URI (Identifier) in the function name, and URL (Locator) in the
return type. I have always used the terms URL and URI interchangeably. But now
that I see them used this way in a well-designed library, the mismatch looks
disturbing. The quoted RFC 3986 (
https://datatracker.ietf.org/doc/html/rfc3986#section-1.1.3) says that URLs
are a subset of URIs. Now, the name `parse_uri` implies that it will
recognize any URI, but on the other hand the result cannot always fit
into a url_view, because not every URI is a URL.

The synopsis for parse_uri (
https://master.url.cpp.al/url/ref/boost__urls__parse_uri.html) says:

> Exception safety: throws nothing.
>

And the line below it says that the function throws std::length_error when
the input is too long. This looks like a bug in the specification. Later we read:

> Return value: A result containing the view to the URL, or an error code if
> the parsing was unsuccessful.
>

This is not precise enough to answer the URI-vs-URL question. When can
parsing be unsuccessful? Is it only when the input does not conform to
the grammar? The synopsis says "This function parses a
string according to the URI grammar below", but is it actually a URI grammar
or a URL grammar?

Maybe the "return value" section should say instead:

> Return value: A result containing the view to the URL, or an error code if
> the contents of `s` were not conformant with the above grammar.
>

That is, any other reason for failure (for example, a resource that needed
to be allocated but could not be) may still be reported via exceptions.
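
The contract suggested above can be sketched as follows. This is only an
illustration under stated assumptions, not Boost.URL code: `parse_outcome`,
`parse_sketch`, and `max_input_size` are hypothetical names, the size limit is
arbitrary, and the "grammar" is a toy check rather than the RFC 3986 grammar.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical stand-in for result<url_view>: a value or an error code.
struct parse_outcome
{
    std::string value;      // accepted input (stand-in for url_view)
    std::error_code error;  // set only when the grammar check fails

    bool has_value() const { return !error; }
};

// Hypothetical size limit, chosen only for illustration.
constexpr std::size_t max_input_size = 64;

// Toy grammar (NOT RFC 3986): a non-empty scheme followed by ':'.
parse_outcome parse_sketch(const std::string& s)
{
    // Failure unrelated to the grammar: reported via an exception.
    if (s.size() > max_input_size)
        throw std::length_error("input too long");

    // Grammar failure: reported through the error code, never thrown.
    const std::size_t colon = s.find(':');
    if (colon == std::string::npos || colon == 0)
        return parse_outcome{std::string(),
            std::make_error_code(std::errc::invalid_argument)};

    return parse_outcome{s, std::error_code()};
}
```

With this split, a caller who checks `has_value()` knows that a reported
error always means "the input did not match the grammar", while
std::length_error signals a different kind of failure. A function documented
this way could not also claim "throws nothing".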

Now, there is probably a good explanation for the URI vs URL discrepancy. I
think it would be good if it were placed in the docs, so that users
don't get confused.

While this might look like a list of complaints, I really appreciate the
effort the authors put into creating this library and its documentation. The
documentation is of really high quality, way higher than the average you will
find on GitHub. It is precisely because of this high quality that I am
able to spot and report these issues.

Regards,
&rzej;

