From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2023-01-28 06:57:03


>On Fri, Jan 27, 2023 at 8:09 PM Klemens Morgenstern via Boost <boost_at_[hidden]> wrote:
> I do like reducing dependencies, but I don't see the point of separating
> things just for the sake of it.

There are benefits to a strategic refactoring of composite libraries
("separating things"), but that is not why I am proposing this. The
answers to the questions follow, but first I want to give readers a
key insight gleaned from field experience.

There is a recurring buffer motif which comes in two dual forms:

1. reading from or writing into given buffers
2. providing buffers to be read from or written into

These operations are fundamental, and one cannot be efficiently
expressed in terms of the other. The two forms of this pattern occur
with such frequency across varying domains that there is value in
designing a common interface to express them. More on this later.
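
To make the two forms concrete, here is a minimal sketch. Every name
in it (source, buffered_source, drain, and the string-backed
implementations) is invented for illustration and is not the
Boost.Buffers interface. In the first form the caller hands over a
buffer to be filled; in the second the object lends out its own
buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical names throughout; not the Boost.Buffers API.

// Form 1: the caller provides the buffer, the object writes into it.
struct source
{
    virtual ~source() = default;
    // Copies up to n bytes into dest; returns 0 once exhausted.
    virtual std::size_t read(void* dest, std::size_t n) = 0;
};

// Form 2: the object provides the buffer, the caller reads from it.
struct buffered_source
{
    struct view { char const* data; std::size_t size; };
    virtual ~buffered_source() = default;
    // Readable region currently available (size 0 once exhausted).
    virtual view data() const = 0;
    // Marks the first n bytes of that region as consumed.
    virtual void consume(std::size_t n) = 0;
};

class string_source : public source
{
    std::string s_;
    std::size_t pos_ = 0;
public:
    explicit string_source(std::string s) : s_(std::move(s)) {}
    std::size_t read(void* dest, std::size_t n) override
    {
        std::size_t const k = std::min(n, s_.size() - pos_);
        if (k != 0)
            std::memcpy(dest, s_.data() + pos_, k);
        pos_ += k;
        return k;
    }
};

class string_buffered_source : public buffered_source
{
    std::string s_;
    std::size_t pos_ = 0;
public:
    explicit string_buffered_source(std::string s) : s_(std::move(s)) {}
    view data() const override
    {
        return { s_.data() + pos_, s_.size() - pos_ };
    }
    void consume(std::size_t n) override
    {
        assert(n <= s_.size() - pos_);
        pos_ += n;
    }
};

// Form 1 consumer: needs scratch space, so each byte is copied twice.
inline std::string drain(source& src)
{
    std::string out;
    char tmp[8];
    while (std::size_t n = src.read(tmp, sizeof tmp))
        out.append(tmp, n);
    return out;
}

// Form 2 consumer: reads straight from the lent region, one copy.
inline std::string drain(buffered_source& src)
{
    std::string out;
    for (auto v = src.data(); v.size != 0; v = src.data())
    {
        out.append(v.data, v.size);
        src.consume(v.size);
    }
    return out;
}
```

Either form can be adapted to the other, but only by adding an
intermediate buffer and an extra copy, which is the inefficiency
described above.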

> If you're writing a boost library most users will have installed boost as a
> monolith; but even those that didn't will use this in the context of http+asio
> so that avoiding the inclusion of `asio/buffer.hpp` doesn't really seem like
> a great motivation for a new library to me.

We have libraries right now that implement serialization and parsing
for things such as JSON, HTTP message bodies, deflate or gzip
compression and decompression. Soon we will likely have Boost.Mustache
which applies parameters to a template to produce output. And having a
MIME/forms library is probably inevitable. We want to allow users to
use these things as payloads for incoming or outgoing HTTP requests
and responses.

Currently this is done by adding the support to the library which
implements HTTP, such as Boost.Requests. Note how Boost.Requests
has to know about JSON and how to serialize or deserialize it:

<https://github.com/CPPAlliance/requests/blob/1214608d19f6a8c020671fcd3507e00ec6d1b3d4/include/boost/requests/json.hpp>

To add support for some new body X, the structural model in use by
Requests as written requires changing the Requests library itself,
because the interface to serialize or parse between X and HTTP
needlessly conflates networking and asio buffers as can be seen here:

<https://github.com/CPPAlliance/requests/blob/1214608d19f6a8c020671fcd3507e00ec6d1b3d4/include/boost/requests/json.hpp#L98>

The maintainer of Requests has to manually add a new asynchronous
algorithm for every HTTP method in order to receive X as a body type.
Here are the signatures for receiving a response containing body X for
the HTTP methods GET, POST, and PUT. More methods need to be
supported, but I have not linked them here; check the file to see
them:

<https://github.com/CPPAlliance/requests/blob/1214608d19f6a8c020671fcd3507e00ec6d1b3d4/include/boost/requests/json.hpp#L211>
<https://github.com/CPPAlliance/requests/blob/1214608d19f6a8c020671fcd3507e00ec6d1b3d4/include/boost/requests/json.hpp#L244>
<https://github.com/CPPAlliance/requests/blob/1214608d19f6a8c020671fcd3507e00ec6d1b3d4/include/boost/requests/json.hpp#L312>

Every time Requests wants to support a new body type, it must add as a
dependency the library containing the type it wishes to support. And
then it must re-implement that support four times for every HTTP
method: two synchronous versions (with and without exceptions) and two
asynchronous versions (with and without exceptions). If
Boost.Http.Proto and Beast used this model, they too would have to
depend on every library which has a candidate body, and duplicate the
serialization and deserialization algorithms (called BodyReader and
BodyWriter in Beast).

This is clearly unsustainable.

I propose a better solution which eliminates almost all of the
overloads by refactoring to separate concerns and allows the
algorithms to be expressed once without introducing multiple
dependencies.

Instead of teaching Requests how to serialize and deserialize every X
it wants to support, Boost.Buffers provides asio-agnostic concepts
which allow the library containing X itself to provide the
serialization and deserialization algorithms. Boost.JSON might add
this declaration allowing a json::value to be serialized using
Boost.Buffers:

    #include <boost/json/serializer.hpp>
    #include <boost/buffers/source.hpp>

    namespace boost {
    namespace json {

    struct json_source : buffers::source
    {
        // JSON value to serialize
        value const& jv;

    private:
        buffers::source::results
        read( buffers::mutable_buffers_pair ) override;

        serializer sr_;
    };

    } // json
    } // boost

With this in hand, Requests needs only to implement serialization of
the HTTP body from a buffers::source, to be able to serialize JSON
values. And it accomplishes this by adding only Boost.Buffers as a
dependency, without knowing about Boost.JSON. Furthermore, every
subsequent library that adds an implementation of Source for one or
more of its data types will now become serializable by Requests,
without Requests having to depend on that library and without that
library having to depend on Requests. This works because Boost.Buffers
provides a common set of vocabulary types and concepts.
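
As a rough sketch of how this decoupling plays out, consider the
following. Everything in it is an assumption made for illustration:
the sketch namespace, the shape of source::results, the
serialize_body function standing in for the HTTP library, and the
form_source type standing in for a payload library. None of it is the
proposed Boost.Buffers API, and form_source does no escaping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Shared vocabulary (hypothetical stand-in for Boost.Buffers).
namespace sketch {

struct mutable_buffer { char* data; std::size_t size; };

struct source
{
    struct results { std::size_t bytes; bool finished; };
    virtual ~source() = default;
    // Fills b with the next bytes of the payload.
    virtual results read(mutable_buffer b) = 0;
};

} // sketch

// "HTTP library" side: depends only on sketch::source.
inline std::string serialize_body(sketch::source& body)
{
    std::string wire;
    char tmp[16];
    for (;;)
    {
        auto const r = body.read({ tmp, sizeof tmp });
        wire.append(tmp, r.bytes);
        if (r.finished)
            break;
    }
    return wire;
}

// "Payload library" side: depends only on sketch::source.
class form_source : public sketch::source
{
    std::string text_;
    std::size_t pos_ = 0;
public:
    using fields = std::vector<std::pair<std::string, std::string>>;

    explicit form_source(fields const& f)
    {
        // Render key=value pairs joined by '&' (no escaping).
        bool first = true;
        for (auto const& kv : f)
        {
            if (!first)
                text_ += '&';
            first = false;
            text_ += kv.first + "=" + kv.second;
        }
    }

    results read(sketch::mutable_buffer b) override
    {
        assert(b.data != nullptr || b.size == 0);
        std::size_t const n = std::min(b.size, text_.size() - pos_);
        std::copy_n(text_.data() + pos_, n, b.data);
        pos_ += n;
        return { n, pos_ == text_.size() };
    }
};
```

Neither side includes or names the other; a caller that links both can
pass a form_source to serialize_body because they agree on the shared
source interface.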

> I for one don't mind using sub-libraries, e.g. including boost/core for the
> buffer types seems ok to me.

Having asio-agnostic buffers in core is certainly better than not
having them at all, but having them in their own library is better
still. It gets its own library-specific documentation. It runs CI jobs
faster and consumes fewer Drone resources over time. It becomes
more discoverable by users. We can follow the principle of one type or
concern per header file instead of cramming everything into a single
buffer.hpp file. This lets downstream libraries decide which headers
they actually need, to cut down on compile times. We can add useful
buffer and dynamic buffer implementations such as buffers_pair,
circular_buffer, flat_buffer, and the perennial favorite
array_of_buffers (supporting sizes of up to one million or more).

Additionally, the library can be evolved independently of the Asio
author, which in simple terms means that open issues will receive
immediate attention, emails will get answers, and the maintainers can
be reached on Slack. As you said, "most users will have installed
boost as a monolith." This means there is no penalty for its existence
as its own Boost library. But there are the benefits stated above.

These benefits are not theoretical; Requests aims to add forms, files,
and various other things. When these are expressed in terms of
Boost.Buffers, they will automatically work in Requests without the
need to add support for them specifically. But there is another huge
benefit: every library which adds support for parsing and/or
serialization in terms of Boost.Buffers concepts can now be used in
Boost.Http.Proto with no modification or changing of library
dependencies!

At this point, someone like Peter might object. He might say that
Boost.JSON doesn't care about sources and sinks; it only cares about
JSON. Or worse, he might become dismissive if I ask him to support
Boost.Buffers in the upcoming Boost.Mustache. But what is the purpose
of having this giant monolith of libraries, if not to take advantage
of our ability to make all the libraries work together seamlessly with
the other libraries, and enhance the value of the collection as a
whole?

> What I would however like is a unification of the two dynamic_buffer
> concepts, which are currently a mess. Taking the beast buffers and
> making them work nicely with asio would be great.

This is a good goal, but it is orthogonal to what I am proposing here.
Boost.Buffers does not use Asio, does not depend on Asio, does not
require its downstream libraries to use Asio, and does not replace
Asio for libraries that need to perform networking. The fact that it
uses concepts that are astoundingly similar to Asio merely
demonstrates that the Asio author discovered very good general purpose
abstractions.

> Thus a container type for a chunk of raw memory should also be included.
> I.e. something akin to a vector<void> if you will.

Hmm... I do not believe std::vector<void> can be made to work but we
can certainly do std::vector<unsigned char> (and we should).
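
For what it is worth, here is a small sketch of what that could look
like. The alias raw_memory and the helper append_bytes are
hypothetical names chosen for this example only. A std::vector<void>
is ruled out because void is not an object type, whereas unsigned
char may legitimately hold the bytes of any object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical alias: an owning, resizable chunk of raw bytes.
using raw_memory = std::vector<unsigned char>;

// Appends n bytes starting at p to the end of m.
inline void append_bytes(raw_memory& m, void const* p, std::size_t n)
{
    assert(p != nullptr || n == 0);
    auto const* first = static_cast<unsigned char const*>(p);
    m.insert(m.end(), first, first + n);
}
```

Bytes stored this way can be copied back into an object of the
original type with std::memcpy.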

Thanks

