Boost logo

Boost :

Subject: Re: [boost] [http] Formal review of Boost.Http
From: Lee Clagett (forum_at_[hidden])
Date: 2015-08-17 01:01:17


On Fri, Aug 7, 2015 at 3:08 AM, Bjorn Reese <breese_at_[hidden]>
wrote:

> Dear Boost and ASIO communities.
>
> The formal review of Vinícius dos Santos Oliveira's Boost.Http library
> starts today, August 7th and ends on Sunday August 16th.
>
> Boost.Http is designed for various forms of modern HTTP interaction,
> from normal HTTP request, over HTTP chunking and pipelining, to
> upgrading to other web protocols like WebSocket. This library builds
> on top of Boost.ASIO, and follows the threading model of ASIO.
>
> The two basic building-blocks are http::socket, which is socket that
> talks HTTP, and http::message with contains HTTP meta-data and body
> information. You can use these building-blocks to build a HTTP server
> that fits your exact needs; for instance, an embedded HTTP server for
> a ReST API. Boost.Http comes with a light-weight HTTP server and a
> static file server.
>
> Currently, Boost.Http is limited to server-side interaction, but the
> design principles used extends to client-side as well.
>
> Boost.Http was originally developed as part of GSoC 2014 and Vinícius
> has continued to develop and improve the library since then.
>
> The documentation can be found here:
>
> http://boostgsoc14.github.io/boost.http/
>
> The source code can be found here:
>
> https://github.com/BoostGSoC14/boost.http
>
> The source code is build using CMake, but Boost.Build is in the pipeline
> (already done for documentation.)
>
> Please answer the following questions:
>
> 1. Should Boost.Http be accepted into Boost? Please state all
> conditions for acceptance explicity.
>

I do not think Boost.Http should be accepted in its current state. My main
reasons (read portions below for expanded comments/thoughts):

(1) Incomplete design - There is no built-in mechanism for writing out
client headers, which means there is no client support.
(2) Unproven design - A low-level abstraction for a "HttpSocket" is
provided, but only two higher-order methods for file serving are provided.
Why aren't there higher-order functions for more common use cases?
(3) More considerations for design choices (or documentation wording)
should be considered - All data reads / writes for the message body require
a copy to be made.
(4) More time is needed to think about the design patterns (this follows
from a lack of (2)) - why is `async_write_response_continue` a requirement
for the http::ServerSocket concept?
(5) Documentation can be better organized and lacks critical information in
a few areas.

 2. What is your evaluation of the design?
>

- There is no built-in mechanism for writing out client headers, which
means there is no client support. The provided rationale is that it took a
long-time to design the server portion, and the client side will take
equally as long. I don't see this as a valid argument; several current
library authors have indicated that providing a "boost-ready" library is an
incredible amount of work. I would expect a boost::http library to be more
feature complete before being accepted.

- An abstraction for a "HttpSocket" is provided, but only a few
higher-order functions are provided (for file serving). The current
lower-level abstractions _should_ allow for useful utilities to be written,
but I would like to see more functionality implemented before boost
acceptance, so the lower-level design is more "proven".

- The http::ServerSocket concept defines methods for writing response
headers, writing the body, writing trailers, and writing an entire response
message at once. This concept provides everything necessary for writing a
valid HTTP response. So why is `async_write_response_continue` mandated in
the http::ServerSocket concept when a free function can be implemented,
that uses the other concept functions? If the reasoning is performance
(which appears to be the reason) - why isn't a more performant method
(probably some form of pre-generation) of sending messages added? This
would prevent the socket concept from continually absorbing additional
special case messages as higher order functions are developed, and allow
for clients to pre-generate error responses.

- `async_read_request`, and `async_write_response_metadata` in the
http::ServerSocket concept require an entire http::Message concept as an
argument, when only the AssociativeContainer concept should be necessary
for this function.

- `async_write` and `async_read_some` in the http::Socket concept requires
an entire http::Message concept as an argument, when only the
SequenceContainer concept should be necessary for this function.

- All data reads / writes for the message body require a copy to be made,
because memory from a SequenceContainer concept cannot be given to ASIO/OS
directly (std::list<std::uint8_t> is a SequenceContainer). The message body
should likely be restricted to the C++1z ContiguousContainer concept
instead. I think users would prefer speed here more than flexibility. Its
also easier to relax requirements after release than the opposite.

- Ideally the read / write functions for the payload would have the same
requirements as an asio buffer. Requiring a container prevents memory
locations from other sources (such as directly from a streambuf), and
prevents scatter gather i/o. I'd say this is less critical, as its
something most users are unlikely to use.

- `async_read_trailers` and `async_writer_trailers` in the http::Socket
concept require an entire http::Message concept as an argument, when only
the AssociativeContainer concept should be necessary.

- I mentioned this before; client and server specific header
writing/reading is a shame due to the overwhelming similarities. My earlier
suggestions were criticized for being stringly typed (or having more
states), but I think this was a somewhat weak criticism since the type
system can be used more effectively.
`async_write_header(http::response{200, "OK"}, ...)` or
`async_write_header(http::request{http::request::GET, "/"}, ...)`.

- It might be worth eventually relaxing the requirements for the key/values
in the AssociativeMap used by the field parsing functions to a subset of
string functions. begin(), end(), size(), empty(), push_back(), pop_back(),
operator<(), a hash implementation, and contigous data. This way a quick
and dirty small string optimization could be written for field handling.

- What is the purpose of the standalone `async_write_response` and
`async_write_response_metadata` functions? They only forward to functions
with the same name in the http::ServerSocket concept, is this cruft?

- I like that all of the asynchronous callbacks take consistent parameters.
asio::coroutine works nicely.

> 3. What is your evaluation of the implementation?
>
>
Overall the quality is good. I didn't go read the code thoroughly, but I
did notice a few things -

- The current `async_write_response` implementation assumes contiguous
elements on the SequenceContainer concept, which is incorrect.

- A std::ostringstream is used, when spirit::karma would be faster and have
less memory allocations (minor).

- The current `async_read_some` implementation has an error code path that
can initiate a write. This is not documented anywhere, and effectively
issuing an `async_read_some` command blocks both the send and receive
channels of the underlying ASIO stream. This problem is likely to occur in
a few other spots, but I didn't check.

- There are some error code paths in async operations that modify the
entire message object (clear() is invoked on all parts of the message).
Seems unnecessary, and is at least undocumented.

 4. What is your evaluation of the documentation?
>

Needs better organization, and someone to edit the rationale section in
particular.

- The reference section is organized strangely. Every class, concept,
function, and file is listed under the same header, and then its broken
into sections at the bottom. I didn't notice the bottom portion at first,
and it should replace the top portion that lacks organization.

- How do the various write functions manipulate the headers? What fields
are added, if any? This is partially mentioned at the bottom of the
http::ServerSocket page, but I saw "content-length: 22" added to my
message, and this was never explicitly stated (although obviously assumed).
This should likely be explicitly specified in the concept itself, AND the
location should be specified (i.e. these fields are explicitly appended).
Should also document that some fields, such as content-length cannot be
listed twice according to the specification, while comma de-limited fields
can be listed multiple times as a valid form of appension. I.e.
"content-encoding: gzip, sdch\r\n" and "content-encoding:
gzip\r\ncontent-encoding: sdch\r\n" indicate equivalent content encodings
applied to the data. Should probably mention that Transfer-Encoding is a
comma delimited list too.

- The `key_type` and `mapped_type` for the headers portion of the
message::Concept indicate that they must meet the "String concept". I don't
recall seeing this concept being defined in C++ or boost, where does it
come from? If its from C++1z, it might be to merge the behavior of
string_ref and basic_string, so that concept would be unlikely to require
storage like a container. Might want to re-think the concept requirements
here, or state that only a conforming std::basic_string implementation is
acceptable for now.

- Mentioned previously - the documentation for basic_socket should state
when a function is using the read or write portion of the stream. Currently
some functions use both unexpectedly, such as async_read_some.

> 5. What is your evaluation of the potential usefulness of the library?
>
>
This library has multiple potentially use cases, especially if a good
low-level design is provided. Additional higher-level functions would be
nice, because many (perhaps most) users have similar requirements, don't
need extreme performance, and just want something simple and relatively
efficient. Hopefully work will be continued on this library, regardless of
the outcome of this review process.

 6. Did you try to use the library? With what compiler? Did you have
> any problems?
>
>
I ran all of the tests on OSX, which had no complaints. I also wrote a
quick asio::coroutine implementation, and viewed the data passing by in
wireshark for good measure.

> 7. How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
>

I spent a good amount of time reading the documentation, and
implementation. A little less testing / running code.

> 8. Are you knowledgeable about the problem domain?
>

I have fairly extensive knowledge of protocols, and experience providing
implementations for protocols (including HTTP). I also have a reasonable
amount of ASIO experience.

Lee


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk