Subject: Re: [boost] [http] Formal Review
From: Antony Polukhin (antoshkka_at_[hidden])
Date: 2015-08-14 16:42:56


2015-08-13 18:38 GMT+03:00 Vinícius dos Santos Oliveira <
vini.ipsmaker_at_[hidden]>:

> This idea is horrible. An HTTP message is not a std::string. HTTP does have
> a string representation, the HTTP wire format from HTTP 1.1, which is
> different than HTTP 2.0, which is different than FastCGI, which would be
> different than a ZeroMQ-based approach and so on and so on.
>

You've totally misunderstood the example. Let's take a look at it:

1) For simplicity let's assume that we are working with HTTP1.1 and HTTP1.0
only. In that case:

namespace http {
  typedef boost::asio::ip::tcp::socket socket;
  typedef boost::asio::ip::tcp::acceptor acceptor;
}

2) Here's the part that takes care of communications only:

#include <ctime>
#include <iostream>
#include <string>
#include <boost/bind.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/asio.hpp>

std::string make_daytime_string() {
  using namespace std; // For time_t, time and ctime;
  time_t now = time(0);
  return ctime(&now);
}

class http_connection
  : public boost::enable_shared_from_this<http_connection> {
public:
  typedef boost::shared_ptr<http_connection> pointer;

  static pointer create(boost::asio::io_service& io_service) {
    return pointer(new http_connection(io_service));
  }

  http::socket& socket() {
    return socket_;
  }

  void start() {
    message_.resize(4 * 1024);

    boost::asio::async_read(socket_, boost::asio::buffer(message_),
        // read until the whole request is in `message_`
        http::completions::full(message_),
        boost::bind(&http_connection::handle_read, shared_from_this(),
          boost::asio::placeholders::error,
          boost::asio::placeholders::bytes_transferred));
  }

private:
  http_connection(boost::asio::io_service& io_service)
    : socket_(io_service)
  {}

  void handle_write(const boost::system::error_code& /*error*/,
      size_t /*bytes_transferred*/)
  {}

  void handle_read(const boost::system::error_code& error,
      size_t bytes_transferred);

  http::socket socket_; // manages the connection only
  std::string message_;
};

class http_server {
public:
  http_server(boost::asio::io_service& io_service)
    : acceptor_(io_service,
        http::endpoint(http::all_versions, tcp::v4(), 80))
  {
    start_accept();
  }

private:
  void start_accept() {
    http_connection::pointer new_connection =
      http_connection::create(acceptor_.get_io_service());

    acceptor_.async_accept(new_connection->socket(),
        boost::bind(&http_server::handle_accept, this, new_connection,
          boost::asio::placeholders::error));
  }

  void handle_accept(http_connection::pointer new_connection,
      const boost::system::error_code& error)
  {
    if (!error) {
      new_connection->start();
    }

    start_accept();
  }

  http::acceptor acceptor_;
};

int main() {
  try {
    boost::asio::io_service io_service;
    http_server server(io_service);
    io_service.run();
  } catch (std::exception& e) {
    std::cerr << e.what() << std::endl;
  }

  return 0;
}

Assuming that http::socket is a tcp::socket, that example is EXACTLY the
same code as the ASIO example at
http://www.boost.org/doc/libs/1_58_0/doc/html/boost_asio/tutorial/tutdaytime3/src.html

message_ is just a BUFFER for data. You can use std::vector<char>,
std::vector<unsigned char> or std::array<> here. It's not a string, it's
the buffer!

http::completions::full(message_) is a functor, that returns 'Stop reading
data' when our buffer (message_) contains the whole HTTP request.

So, what does it mean? It means that when the http_connection::handle_read
function is called, message_ contains all the data received from the socket.

3) http::view NEVER downloads data. http::view is a parser that represents
input data as an HTTP message and provides a user-friendly collection of
methods to READ data.

It means that it could be used just like this:

const char data[] =
    "GET /wiki/HTTP HTTP/1.0\r\n"
    "Host: ru.wikipedia.org\r\n\r\n"
;

http::view v(data);
assert(v.version() == "HTTP1.0");
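
A rough sketch of how such a view could work on raw data is below. It is not
a proposed implementation. The class name request_line_view and its accessors
are hypothetical, it parses only the request line, and it assumes C++17
std::string_view (boost::string_ref would do the same job). Note that it
returns the version exactly as written on the wire ("HTTP/1.0").

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical, heavily simplified stand-in for http::view.
// It stores only views into the caller's buffer: no copies, no sockets.
class request_line_view {
public:
  explicit request_line_view(std::string_view data) {
    // The request line is everything up to the first CRLF.
    const std::string_view line = data.substr(0, data.find("\r\n"));
    const std::size_t first = line.find(' ');
    const std::size_t last = line.rfind(' ');
    if (first == std::string_view::npos || first == last) {
      return;  // malformed request line: all accessors stay empty
    }
    method_ = line.substr(0, first);
    target_ = line.substr(first + 1, last - first - 1);
    version_ = line.substr(last + 1);
  }

  std::string_view method() const { return method_; }
  std::string_view target() const { return target_; }
  std::string_view version() const { return version_; }

private:
  std::string_view method_;
  std::string_view target_;
  std::string_view version_;
};
```

The buffer passed in must outlive the view, which is exactly the kind of
memory decision that stays in the user's hands with this approach.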

You've been misled by the http::view::read_state() function. It does not
inspect sockets. It determines the state from the message content:

const char data1[] =
    "GET /wiki/HTTP HTTP/1.0\r\n"
    "Content-Length: 0\r\n"
    "Host: ru.wikipedia.org\r\n\r\n"
;

http::view v1(data1);
assert(v1.read_state() == http::read_state::empty);

const char data2[] =
    "GET /wiki/HTTP HTTP/1.0\r\n"
    "Content-Length: 12312310\r\n"
    "Host: ru.wikipedia.org\r\n\r\n"
    "Hello word! This is a part of the message, because it is not totally "
    "recei"
;

http::view v2(data2);
assert(v2.read_state() != http::read_state::empty);

4) Why is this approach better?

* It explicitly allows the user to manage networking and memory.
* It separates work with the network from work with the HTTP message.
* It reuses the ASIO interface.
* It does not implicitly allocate memory.
* It can be used separately from ASIO. http::view and http::generator do
not use ASIO at all.
* Your assumption that there is always a requirement to read headers and
body separately is very wrong.
HTTP headers are usually small, and the whole HTTP message can usually be
received in a single read. So when you force the user to read headers and
body separately, you're forcing more system calls/locks/context switches on
them. A read with http::completions::full(message_), however, will in most
cases result in a single read inside ASIO and a single call to
http::completions::full::operator(). This will result in better performance.
* If there's a user requirement to read headers separately from body, this
could be achieved by http::completions::body + http::completions::headers.
* Want to take care of memory and networking in Boost.HTTP library just
like cpp-netlib does? That's simple, make a wrapper for beginners:

struct handler { // this handler is written by the user
    void operator() (http::view const &request,
                     http::generator &response) {
        response = http::response::stock_reply(
            http_server::response::ok, "Hello, world!");
    }

    void log(std::string const &info) {
        std::cerr << "ERROR: " << info << '\n';
    }
};

void http_connection::handle_read(const boost::system::error_code& error,
    size_t bytes_transferred) {
    if (error) {
        user_handler.log(error.message()); // user provided `struct handler`
        return;
    }

    std::vector<char> response_holder(1024);

    {
        boost::system::error_code e;
        http::view v(message_, e); // represent buffer as http message

        http::generator response(response_holder);

        user_handler(v, response); // user provided `struct handler`
    }

    boost::asio::async_write(socket_, boost::asio::buffer(response_holder),
        boost::bind(&http_connection::handle_write, shared_from_this(),
          boost::asio::placeholders::error,
          boost::asio::placeholders::bytes_transferred));
}

The `struct handler` above is taken from the cpp-netlib example:
http://cpp-netlib.org/0.11.1/index.html

To sum up.
If you're attempting to compete with cpp-netlib, then you've started wrong:
* cpp-netlib is very simple to use. If you always require io_service and
coroutines then your library is hard to use.
* separate reads of headers and body are not what is really required
* there are no advantages over cpp-netlib

If I've missed some advantages of your library - please highlight them.

How to fix things?

* Provide more functionality than cpp-netlib:
    * Allow users to manipulate memory and networking.
    * Untie the library from networking and allow parts of it to be used on
raw data (http::view/http::generator).
    * Reuse ASIO interfaces and provide simple migration for users that
already use ASIO for HTTP. (tcp::socket -> http::socket)
    * HTTP2.0 ?
* Provide a simple interface for beginners. If your first example consumes
two screens of text, while cpp-netlib's example consumes 0.5 screen, you'll
lose.

-- 
Best regards,
Antony Polukhin
