Subject: Re: [boost] [http] Formal Review
From: Vinícius dos Santos Oliveira (vini.ipsmaker_at_[hidden])
Date: 2015-08-15 05:54:45


2015-08-14 17:42 GMT-03:00 Antony Polukhin <antoshkka_at_[hidden]>:

> You've totally misunderstood the example.
>

Yes, I'm sorry. The debate with Lee helped me to realize my mistake:
http://article.gmane.org/gmane.comp.lib.boost.devel/262306

Let's take a look at it:
>
> 1) For simplicity let's assume that we are working with HTTP1.1 and HTTP1.0
> only.
>

As long as the design doesn't hamper support for alternative HTTP backends.

In that case:
>
> namespace http { typedef boost::asio::ip::tcp::socket socket; typedef
> boost::asio::ip::tcp::acceptor acceptor; }
>
> 2) Here's the part that takes care of communications only:
>
> [...]
>
> Assuming that http::socket is a tcp::socket, that example is EXACTLY the
> same code that is in ASIO example at
>
> http://www.boost.org/doc/libs/1_58_0/doc/html/boost_asio/tutorial/tutdaytime3/src.html
>

You still have to defend why discarding all HTTP characteristics from an
HTTP library is an advantage for an HTTP library.

message_ is just a BUFFER for data. You can use std::vector<char>,
> std::vector<unsigned char> or std::array<> here. It's not a string, it's
> the buffer!
>
> http::completions::full(message_) is a functor, that returns 'Stop reading
> data' when our buffer (message_) contains the whole HTTP request.
>

In my design, it is "stop reading data when our buffer is full or contains
the whole HTTP request". The "growable" thing is the HTTP message, which can
be customized and is decoupled from the HTTP server.
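
A minimal sketch of that stop condition, written against plain std::string so it does not depend on Asio or Boost.Http. The names read_decision, announced_body_length and decide are hypothetical and appear in neither library; the sketch also assumes the body length is announced through Content-Length and ignores chunked transfer coding.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names, for illustration only.
enum class read_decision { keep_reading, message_complete, buffer_full };

// Body length announced by Content-Length, or 0 when the header is absent.
// `headers` is the request line plus header lines, without the final CRLFCRLF.
inline std::size_t announced_body_length(std::string headers) {
    // Header field names are case-insensitive, so compare in lower case.
    std::transform(headers.begin(), headers.end(), headers.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    const std::string name = "\r\ncontent-length:";
    const std::size_t pos = headers.find(name);
    if (pos == std::string::npos) return 0;
    std::size_t i = pos + name.size();
    while (i < headers.size() && (headers[i] == ' ' || headers[i] == '\t')) ++i;
    std::size_t value = 0;
    while (i < headers.size() && std::isdigit(static_cast<unsigned char>(headers[i]))) {
        value = value * 10 + static_cast<std::size_t>(headers[i] - '0');
        ++i;
    }
    return value;
}

// Stop when the whole request has arrived or when the buffer is full.
inline read_decision decide(const std::string& received, std::size_t capacity) {
    assert(capacity > 0);
    const std::size_t end_of_headers = received.find("\r\n\r\n");
    if (end_of_headers != std::string::npos) {
        const std::size_t body_start = end_of_headers + 4;
        const std::size_t body_length =
            announced_body_length(received.substr(0, end_of_headers));
        if (received.size() - body_start >= body_length)
            return read_decision::message_complete;
    }
    return received.size() >= capacity ? read_decision::buffer_full
                                       : read_decision::keep_reading;
}
```

The buffer_full outcome is the part that matters for this discussion: the caller gets control back with a partial message and can consume it before reading more.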

So, what does it mean? It means that when http_connection::handle_read
> function is called, message_ contains whole data received from socket.
>

How would you handle a progressive download?

3) http::view NEVER downloads data. http::view is a parser, that represents
> an input data as an HTTP message and provides a user friendly collection of
> methods to READ data.
>
> It means that it could be used just like this:
>
> const char data[] =
> "GET /wiki/HTTP HTTP/1.0\r\n"
> "Host: ru.wikipedia.org\r\n\r\n"
> ;
>
> http::view v(data);
> assert(v.version() == "HTTP1.0");
>
> You've been misled by the http::view::read_state() function. It does not
> investigate sockets. It determines the state from the message content:
>
> const char data1[] =
> "GET /wiki/HTTP HTTP/1.0\r\n"
> "Content-Length: 0\r\n"
> "Host: ru.wikipedia.org\r\n\r\n"
> ;
>
> http::view v1(data1);
> assert(v1.read_state() == http::read_state::empty);
>
>
> const char data2[] =
> "GET /wiki/HTTP HTTP/1.0\r\n"
> "Content-Length: 12312310\r\n"
> "Host: ru.wikipedia.org\r\n\r\n"
> "Hello word! This is a part of the message, because it is not totally recei"
> ;
>
> http::view v2(data2);
> assert(v2.read_state() != http::read_state::empty);
>

Your design is very much like cpp-netlib, where the whole request is read
at once. Again: how would you handle a progressive download?

But, forgetting about specifics and focusing on the opposition of
socket+message vs socket+buffer+view/parser, I might comment more later
today.

4) Why this approach is better?
>
> * It explicitly allows user to manage networking and memory.
>

That's difficult to achieve with support for multiple HTTP backends. Each
HTTP backend might have its own guarantees.

If I only need to provide guarantees related to a single backend, I can do
this with the current design ideas (the implementation needs updates).

* It separates work with network and work with HTTP message.
>

Can you elaborate further (and also explain why it is better and how the
current Boost.Http approach is worse)?

* It reuses ASIO interface
>

By dropping HTTP information/power.

I like the http::stream adapter idea better. Adapt the HTTP socket to mimic
Asio design. You don't lose features.

* It does not implicitly allocate memory
>

The parser can end up allocating memory. You have to consider support for
different HTTP backends. You can give guarantees about one backend (or
any one you provide), but not for all of them.

* It can be used separately from ASIO. http::view and http::generator do
> not use ASIO at all.
>

The view and generator will be different for each HTTP backend. The only
view that I believe will be useful without Asio will be the HTTP wire
format parser. I don't expose one yet. I didn't propose a design for an
HTTP parser.

* Your assumption that there is always a requirement to read headers and
> body separately is very wrong.
> HTTP headers are not so big and usually the whole HTTP message could be
> accepted by a single read. So when you force user to read headers and body
> separately you're forcing user to have more system calls/locks/context
> switches. However read with http::completions::full(message_) in most cases
> will result in a single read inside ASIO and a single call to
> http::completions::full::operator(). This will result in better
> performance.
>

After issuing a read request, more than one component can be read (hence
the need for read_state). The same callback handles any "excess" components
that were read.

Also, reading headers before body is important thanks to HTTP 100-continue.
To design an HTTP library, you don't just use your C++ knowledge; you also
have to research a lot about HTTP itself.
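
To make the 100-continue point concrete, here is a minimal sketch of the decision a server has to take after the headers arrive and before any body is read. The helpers expects_continue and interim_response are hypothetical names, not Boost.Http API; the sketch only inspects a header block held in a std::string.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true when the header block carries "Expect: 100-continue".
// `headers` is the request line plus header lines.
inline bool expects_continue(std::string headers) {
    // Field names and the 100-continue token are case-insensitive.
    std::transform(headers.begin(), headers.end(), headers.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    const std::string name = "\r\nexpect:";
    const std::size_t pos = headers.find(name);
    if (pos == std::string::npos) return false;
    std::size_t i = pos + name.size();
    while (i < headers.size() && (headers[i] == ' ' || headers[i] == '\t')) ++i;
    return headers.compare(i, 12, "100-continue") == 0;
}

// Hypothetical helper: what to write back before reading the body.
// An empty string means "nothing to send, just read the body".
inline std::string interim_response(const std::string& headers) {
    return expects_continue(headers) ? "HTTP/1.1 100 Continue\r\n\r\n" : "";
}
```

A client that sent the Expect header may hold back the body until it sees the interim response, so a design that only reports "whole message received" can leave both sides waiting.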

* cpp-netlib is very simple to use. If you always require io_service and
> coroutines then your library is hard to use.
>

I started with a strong core proposal. High-level abstractions will follow.
You have much more flexibility with Boost.Http than with cpp-netlib (and
many other libraries out there).

For a high-level API, I would prefer something like this:
https://github.com/d5/node.native

It supports more HTTP features than cpp-netlib and it is much less
coupled/limiting. You could have compared how verbose Boost.Http is
against node.native, which at least has comparable features.

* separate headers/body reads are not what is really required
>

So you'll exhaust the application memory during live video streams. I'm not
going to change that. But you gave good ideas and a good motivation to make
the interface simpler. Some algorithms that execute the whole read for you
could be provided.

* no advantages over cpp-netlib
>

A few:

   - With the same API, you gain support for alternative HTTP backends.
   - Modern HTTP API with support for HTTP 100-continue, HTTP chunking,
   HTTP pipelining and HTTP upgrade.
   - It uses the Asio extensible asynchronous model that you can turn into
   blocking/synchronous easily (cpp-netlib has **two** server implementations).
   - It uses an active style, so you have more control. You can, for
   instance, defer acceptance of new connections during high load scenarios.
   - It doesn't force a thread pool on you. All network traffic (from
   Boost.Http and any other traffic too) can be handled on the same io_service.
   - It has strong support for asynchronous operations. You can
   progressively download the body, so memory won't be exhausted while an HTTP
   client submits a live video stream to your server. And you can also respond
   to messages with a live video stream, checking through a standardized API
   whether the underlying channel supports streaming.
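
The progressive-download item can be sketched without any networking. The function drain_body below is a hypothetical name, and a std::istream stands in for the connection (an assumption made only to keep the example self-contained); the point is that the working buffer has a fixed size no matter how long the body is.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: hands the body to `on_piece` in pieces of at most
// 4 bytes and returns the total number of bytes delivered.
inline std::size_t drain_body(
    std::istream& source,
    const std::function<void(const char*, std::size_t)>& on_piece) {
    assert(static_cast<bool>(on_piece));
    std::array<char, 4> buffer;  // tiny on purpose; this is the whole working set
    std::size_t total = 0;
    while (source) {
        source.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::size_t n = static_cast<std::size_t>(source.gcount());
        if (n == 0) break;
        on_piece(buffer.data(), n);  // the consumer sees each piece exactly once
        total += n;
    }
    return total;
}
```

Fed a 10-byte stream, the callback runs three times with 4, 4 and 2 bytes. A read-everything-first design would need a buffer as large as the whole body, which is unbounded for a live stream.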

How to fix things?
>
> * Provide more functionality than cpp-netlib:
> * Allow users to manipulate memory and networking.
>

Okay. This could be improved.

    * Untie the library from networking and allow parts of it to be used on
> raw data (http::view/http::generate).
>

The current library doesn't try to compete with HTTP parsers. An HTTP
parser is not exposed.

    * ASIO interfaces re-usage and simple migration for users that already
> use ASIO for HTTP. (tcp::socket -> http::socket)
>

No, tcp::socket is stream-oriented. http::socket is HTTP-oriented. HTTP is
neither stream-oriented (like TCP) nor datagram-oriented (like UDP). HTTP is
a specialization (request-reply) of the message approach.

    * HTTP2.0 ?
>

Okay.

* Simple interface for beginners. If your first example consumes two
> screens of text, while cpp-netlib's example consumes 0.5 screen - you'll
> lose
>

Okay.

-- 
Vinícius dos Santos Oliveira
https://about.me/vinipsmaker
