
Subject: Re: [boost] [asio-users] [http] Formal review of Boost.Http
From: Vinícius dos Santos Oliveira (vini.ipsmaker_at_[hidden])
Date: 2015-08-27 22:40:33


2015-08-25 18:44 GMT-03:00 Lee Clagett <forum_at_[hidden]>:

> What use cases does a container provide over an ASIO buffer?
>

The user may implement a model of the Message concept that just drops the
body entirely. To be fair, this violates the concept as I wrote it, but it
was one of the use cases I had in mind when I designed the class. I should
update the requirements to better reflect that...
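
For instance, a minimal sketch of such a model (the class and member names
here are illustrative, not the actual concept requirements):

    #include <cstddef>
    #include <cstdint>
    #include <map>
    #include <string>

    // Container-like body that silently discards everything appended to it.
    struct null_body
    {
        using value_type = std::uint8_t;
        void push_back(value_type) {}
        template<class It> void insert(It, It, It) {}
        void clear() {}
        std::size_t size() const { return 0; }
    };

    // A message-like type that keeps the headers but drops the body.
    class headers_only_message
    {
    public:
        std::multimap<std::string, std::string>& headers() { return headers_; }
        null_body& body() { return body_; }  // writes go nowhere
        std::multimap<std::string, std::string>& trailers() { return trailers_; }

    private:
        std::multimap<std::string, std::string> headers_, trailers_;
        null_body body_;
    };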

Are there other advantages or special use cases for the container-based
> system?
>

It more cleanly separates the responsibility of passing HTTP messages from
that of producing HTTP messages, which is great for supporting multiple
backends. There is the embedded server backend, for instance, which produces
HTTP messages and feeds them to the user, who consumes them.
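
Roughly, the consumer ends up being written once against the concepts (just
a sketch; `async_write_response` and the other names below are only
indicative, not the final API):

    #include <utility>

    // The consumer works with any backend that produces HTTP messages
    // (embedded server, FastCGI, a mock backend for tests, ...).
    template<class ServerSocket, class Message, class Handler>
    void handle(ServerSocket& socket, Message& request, Message& reply,
                Handler&& handler)
    {
        // Inspect the message the backend produced for us...
        if (request.headers().count("accept"))
            reply.headers().emplace("content-type", "text/plain");

        // ...and feed a reply back through the same channel. How the bytes
        // reach the wire (or whether there even is a wire) is the backend's
        // concern, not the consumer's.
        socket.async_write_response(200, "OK", reply,
                                    std::forward<Handler>(handler));
    }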

Have you considered standalone functions for writing/reading standard
> HTTP/1.x chunked data directly from an ASIO stream?
>

I have thought about it, but I never pursued this approach, because I'm
aiming to support multiple backends. I did think about specializing the
concepts in such a way that you could move the HTTP parser into the HTTP
message object (that's why headers and the other accessors are member
functions, not member variables, so you can have one large blob of memory
holding all the info that is shared by all the functions).
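
Something along these lines (an illustrative sketch only; the concepts
don't mandate this layout, and `zero_copy_message` is a made-up name):

    #include <cstddef>
    #include <string_view>
    #include <vector>

    // A message whose storage is one contiguous blob shared by all
    // accessors. Because headers and friends are member *functions*, an
    // implementation is free to keep the raw bytes (and the parser state)
    // inside the message itself and hand out views into that blob instead
    // of copying.
    class zero_copy_message
    {
    public:
        // Append raw bytes straight from the socket; an incremental parser
        // would update offsets into raw_ here.
        void append(const char* data, std::size_t size)
        {
            raw_.insert(raw_.end(), data, data + size);
        }

        // Accessors return views into the single blob (stubbed out here).
        std::string_view header(std::string_view /*name*/) const { return {}; }
        std::string_view body() const { return {}; }

    private:
        std::vector<char> raw_;   // header section + body, as received
        // parser state / field offsets into raw_ would live here as well
    };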

There might be ways to allow effective usage of the native handle for
> advanced users (thus keeping the default container implementation).
>

From the feedback I had, advanced users fall into three categories:

   - They don't care about multiple backends, so an HTTP parser is the only
   abstraction they will approve of.
   - They don't know HTTP well enough and suggested that I *drop* HTTP
   features so the design could *evolve*, which is quite contradictory for
   an *HTTP library*.
   - They had awesome ideas and gave me excellent feedback that I'll use to
   improve the library design.

Anyway, for the "allow effective usage of the native handle" case, I'll just
provide an HTTP parser, as these users probably don't care about multiple
backends.

- I mentioned this before; client and server specific header
>> > writing/reading is a shame due to the overwhelming similarities. My
>> earlier
>> > suggestions were criticized for being stringly typed (or having more
>> > states), but I think this was a somewhat weak criticism since the type
>> > system can be used more effectively.
>> > `async_write_header(http::response{200, "OK"}, ...)` or
>> > `async_write_header(http::request{http::request::GET, "/"}, ...)`.
>> >
>>
>> This design would have much more implicit behaviour and it'd lead to more
>> surprises. A ResponseBuilder or alike could be done, separately, in case
>> you don't mind the surprises.
>>
>
> What do you mean by implicit behavior? And what surprises? The name of the
> type indicates the action being taken ... ?
>

Writing each header individually means you're either buffering all headers
and waiting until the first chunk of the body is issued, or you're sending
them as you receive them. This is already implicit behaviour, but there is
more.

Details about body delivery must be present in the header section of the
message, but the write_header API doesn't expose the user's intent about
whether they want to create an atomic message or a chunked one, so
workarounds for this API must be added to avoid errors like the header
section not being sent until the body is issued (which is problematic in an
event system and could be solved with a flush_headers call), a 0-sized
chunked message being issued for a 204 reply, and maybe more.

There are many more possible code paths that need to be documented and are
tricky to use correctly.

If a user wants the write_header API, I'd build it on top of the current API
as a separate abstraction (a ResponseBuilder?), as it'd be clearer what
would happen and it would only put the risk on users willing to take it.
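
Something in this direction, perhaps (only a sketch; `response_builder` and
the underlying `async_write_response` call are illustrative, not part of
the current API):

    #include <cstdint>
    #include <string>
    #include <utility>
    #include <vector>

    // Exposes the incremental "write each header" style, but only commits
    // a single, well-formed message to the underlying socket at the end.
    // The surprises (when the header section hits the wire, whether the
    // message ends up atomic or chunked) stay inside this class.
    template<class ServerSocket, class Message>
    class response_builder
    {
    public:
        explicit response_builder(ServerSocket& socket) : socket_(socket) {}

        response_builder& status(std::uint16_t code, std::string reason)
        {
            status_ = code;
            reason_ = std::move(reason);
            return *this;
        }

        response_builder& header(std::string name, std::string value)
        {
            message_.headers().emplace(std::move(name), std::move(value));
            return *this;
        }

        response_builder& body(std::vector<std::uint8_t> data)
        {
            message_.body() = std::move(data);
            return *this;
        }

        // Only here does anything touch the socket, as one atomic write.
        template<class Handler>
        void async_send(Handler&& handler)
        {
            socket_.async_write_response(status_, reason_, message_,
                                         std::forward<Handler>(handler));
        }

    private:
        ServerSocket& socket_;
        Message message_;
        std::uint16_t status_ = 200;
        std::string reason_ = "OK";
    };

A user who opts into this takes the buffering behaviour knowingly, while the
core API keeps the one-message-one-write guarantee.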

- The current `async_read_some` implementation has an error code path that
>> > can initiate a write. This is not documented anywhere, and effectively
>> > issuing an `async_read_some` command blocks both the send and receive
>> > channels of the underlying ASIO stream. This problem is likely to occur
>> in
>> > a few other spots, but I didn't check.
>> >
>>
>> The ServerSocket concept gives guarantees about a producer to HTTP
>> messages
>> (and you, as the user, writes the HTTP consumer). A channel to exchange
>> HTTP messages somehow. There were no guarantees about exposing the
>> implementation detail errors to the user.
>>
>> Imagine a ZeroMQ-based ServerSocket that cannot send "pings" because it
>> was
>> not documented in the general concept. The general ServerSocket concept
>> doesn't restrict the freedom from HTTP "providers" too much.
>>
>>
> The general concept needs to stipulate whether reads or writes can be made
> by the client after initiating an asynchronous operation. This is the most
> important thing that must be documented, because the current implementation
> cannot safely accept a write operation when some (all?) of the read
> operations are still in progress. I think it's acceptable for an
> implementation to inject control messages under such restrictions, as long
> as the user can still issue their own write when the specification allows
> for it. Such implementations _may_ have to indicate the control message
> behavior when documenting access to the "native handle" (and I suspect
> "may" is going to be "always" in practice), in case a user tries to do some
> raw writing. Or maybe the documentation just says "never do raw
> read/writes" with the native handle.
>

A model of the general concept may not even use a socket. Therefore, the
general concept cannot assume a read will be issued at all, nor make any
further assumptions about reads.

This documentation belongs to specific models.

And the write code path you mention is for cases where the socket has become
useless because non-conformant HTTP traffic was detected.

However, I agree that documentation needs improvement here too.

Anyway, this is a serious documentation error within the library and I need
>> to fix it.
>>
>>
> This isn't just a serious documentation error, it's a design problem for
> delivering lower-latency responses. Since the read call can issue an
> automatic error message on the write channel, users of the library cannot
> begin reading the next request (which is likely already in a kernel buffer
> if the client is pipelining requests) until the prior response has finished
> sending. If the errors were not automatically written by the
> implementation, a user could begin reading the next request (potentially
> even preparing DB calls, etc.) while waiting for the response of the prior
> request to complete.
>

Okay.
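
With no automatic writes on the read path, the pattern you describe would
look roughly like this (sketch only; the operation names here are only
illustrative of the concepts, not an actual API):

    #include <boost/system/error_code.hpp>

    // Start sending the previous reply and, without waiting for it to
    // finish, start reading the next pipelined request (which is likely
    // already sitting in a kernel buffer).
    template<class ServerSocket, class Message>
    void serve_pipelined(ServerSocket& socket, Message& next_request,
                         Message& previous_reply)
    {
        socket.async_write_response(200, "OK", previous_reply,
            [](boost::system::error_code) { /* reply finished */ });

        socket.async_read_request(next_request,
            [](boost::system::error_code) { /* prepare next reply */ });
    }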

I do not think this is a fatal issue, but it does make me question the
> low-level design goals. The design/implementation seems to be halfway
> between a constricting easy-to-use high-level design and a performant
> low-level design. It seems reasonable to expect users of the low-level
> interface to have more understanding of HTTP, while the high-level design
> provides a better "safety net" for users. The low-level users would still
> benefit from an HTTP system that reads/writes headers into/from an
> associative container, handles receiving/sending chunked data
> automatically, etc., which is convenient.
>

I'm actually aiming for a high-performance, high-level design, and I expect
these same abstractions to be involved in a higher-level layer.

Anyway, it seems I have lots of changes to do in the library.

-- 
Vinícius dos Santos Oliveira
https://about.me/vinipsmaker
