Boost logo

Boost :

Subject: Re: [boost] Boost.HTTPKit, a new library from the makers of Beast!
From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2017-10-12 12:45:52


On Wed, Oct 11, 2017 at 9:47 PM, Vinícius dos Santos Oliveira
<vini.ipsmaker_at_[hidden]> wrote:
> I took the liberty to convert the Tufão project that you've
> mentioned to use the Boost.Beast parser:
> https://github.com/vinipsmaker/tufao/commit/56e27d3b77d617ad1b4aea377f993592bc2c0d77
>
> Would you say you like the new result better?

It seems pretty reasonable to me.

> Would you say that I've misused your parser to favour my approach?
> How would you have done it in this case?

Misused? I don't think so. The only meaningful change I would make is
that I would have simply called basic_parser::is_keep_alive() instead
of re-implementing the logic for interpreting the Connection header.

> Would you go beyond and accept the idea that the spaghetti effect
> is inherent to the callback-based approach of push parsers?

This is where we are venturing into the world of opinion. It seems
like you have a general aversion to callbacks. But there is a reason
Beast's parser is written this way. Recognize that there are two
primary consumers of the parser:

1. Stream algorithms such as beast::http::read_some
2. Consumers of structured HTTP elements (e.g. fields)

The Beast design separates these concerns. Public member functions of
`basic_parser` provide the interface needed for stream algorithms,
while calls to the derived class provide the structured HTTP elements.
I don't think it is a good idea to combine these into one interface,
which you have done in your parser. The reason is that this
unnecessary coupling pointlessly complicates the writing of the stream
algorithm. Anyone who wants to write an algorithm to feed the parser
from some source of incoming bytes now has to care about tokens. This
is evident from your documentation:

<http://boostgsoc14.github.io/boost.http/#parsing_tutorial1>

In your example you declare a class `my_socket_consumer`. It has a
single function `on_socket_callback` which is called repeatedly with
incoming data. Not shown in your example is the stream algorithm (the
function which interacts with the actual socket to retrieve the data).
However, we know that this stream algorithm must be aware of the
concrete type `my_socket_consumer` and that it needs to call
`on_socket_callback` with an `asio::buffer`. A signature for this
stream algorithm might look like this:

    template<class SyncReadStream>
    void read(SyncReadStream& stream, my_socket_consumer& consumer);

Observe that this stream algorithm can only ever work with that
specific consumer type. In your example, `my_socket_consumer` handles
HTTP requests. Therefore, this stream algorithm can now only handle
HTTP requests. In order to receive a response, a new stream algorithm
must be written. Compare this to the equivalent signature of a Beast
styled stream algorithm:

    template<class SyncReadStream, bool isRequest, class Derived>
    void read(SyncReadStream& stream, basic_parser<isRequest, Derived>& parser);

This allows an author to create a stream algorithm which works not
just for requests which store their data as data members in a class
(`my_socket_consumer`) but for any parser, thanks to the CRTP design.
For example, if I create a parser by subclassing
`beast::http::basic_parser` with an implementation that discards
headers I don't care about, then it will work with the stream
algorithm described above without requiring modification to that
algorithm. It is interesting to note that your `my_socket_consumer` is
roughly equivalent to the beast::http::parser class (which is derived
from beast::http::basic_parser):

<https://github.com/boostorg/beast/blob/f09b2d3e1c9d383e5d0f57b1bf889568cf27c39f/include/boost/beast/http/parser.hpp#L45>

Both of these classes store incoming structured HTTP elements in a
container for holding HTTP message data. However note that unlike
`beast::http::parser`, `my_socket_consumer` also has to know about
buffers:

    void on_socket_callback(asio::buffer data)
    {
        ....
        buffer.push_back(data);
        request_reader.set_buffer(buffer);

It might not be evident to casual readers but the implementation of
`my_socket_consumer` has to know that the parser needs the serialized
version of the message to be entirely contained in a single contiguous
buffer. In my opinion this is a design flaw because it does not
enforce a separation of concerns. The handling of structured HTTP
elements should not concern itself with the need to assemble the
incoming message into a single contiguous buffer; that responsibility
lies with the stream algorithm.

The design decision in Beast is to keep the interfaces used by stream
algorithms separate from the interface used by consumers of HTTP
tokens. Furthermore the design creates a standard interface so that
stream algorithms can work with any instance of `basic_parser`,
including both requests and responses, and for any user-defined
derived class.

Thanks


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk