Subject: [boost] [beast] Request for Discussion
From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2016-09-23 10:58:14
Beast provides low level HTTP and WebSocket interfaces built on
Boost.Asio. The project is here:
https://github.com/vinniefalco/Beast
The plan is to solicit feedback from the Boost community before
submitting a request for a formal review.
First a brief update: I attended CppCon 2016 and gave a 15
minute Lightning Talk on the library. There should be a video up at
some point with all the talks, including mine. I spread the word about
Beast and discovered that there are several folks in the wild already
using the library and happy with it. There are also a couple of
commercial products that are using it.
One point brought up during the conference is that Beast does not
offer sufficient higher level HTTP functionality. This omission is
intentional. It is my idea to offer the building blocks (serializing
and deserializing HTTP/1 messages, in synchronous and asynchronous
flavors, and a universal HTTP message model), upon which other
developers more knowledgeable than me could build those interfaces.
The higher level public operations requested were:
1. Perform HTTP GET or POST and receive the response. Presumably this
would handle both plain and TLS connections, handle redirects, and
handle HEAD requests. Not sure about traversing proxies.
2. Expose an asynchronous HTTP server class which callers can
instantiate with some sort of handler that returns the HTTP response
given a request. It should be possible, in a minimal number of lines,
to quickly get a server up and running.
The requestor informed me that this functionality did not need to
provide every possible feature or handle every use case. It just
needed to handle the common use-cases.
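To make the second request concrete, here is a minimal sketch of the "handler that returns a response given a request" shape. Everything in it is an assumption for illustration: toy_request, toy_reply and toy_server are made-up names, not Beast types, and dispatch is synchronous with no sockets involved. It only shows how little a caller would have to write.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-ins for HTTP message types; NOT Beast's types.
struct toy_request
{
    std::string method;
    std::string target;
};

struct toy_reply
{
    int status;
    std::string body;
};

// The handler shape described in the request: given a request,
// return the response.
using handler = std::function<toy_reply(toy_request const&)>;

// Hypothetical server shell. A real one would own an acceptor and
// call the handler once per parsed request; here dispatch() is
// invoked directly so the sketch runs without any networking.
class toy_server
{
    handler h_;

public:
    explicit toy_server(handler h)
        : h_(std::move(h))
    {
    }

    toy_reply
    dispatch(toy_request const& req) const
    {
        return h_(req);
    }
};
```

Under these assumptions a caller constructs toy_server with a single lambda and is done, which is the "minimal number of lines" property that was asked for.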
For the record, I am against this higher level functionality for a
number of reasons. The higher the level of functionality, the more
contentious the feature becomes, and the harder it is to satisfy a
quorum of reviewers. No one would dispute that we need the ability to
read a complete HTTP message from a socket. But not everyone will
agree that an HTTP server needs an option to allow multiple listening
sockets with different levels of permissions.
Another problem with offering higher level functionality is that it
could be less likely to be approved for standardization in the C++
standard library. For example, the attributes of ease of use and fully
functional oppose each other. If the functionality offered by a
hypothetical std::http::client is not sufficient, a developer would be
forced to write their own with little possibility to reuse parts of
the implementation from the stdlib. I doubt the committee would
approve of such a thing. I know I wouldn't. Admittedly I have little
experience or knowledge on what would or would not get approved.
A point was raised that a low-level-only HTTP implementation in Boost
has the potential to discourage new users that need to perform simple
operations like HTTP GET from using Boost in the future, given that
robustly handling such operations would require the developer to write
a significant amount of code. I can see some validity here. On the
other hand, if Boost.Asio was submitted for a formal review today,
would those same points be raised? Would Asio get rejected because it
doesn't offer public interfaces for doing simple tasks like requesting
an object from a web server? I wonder if perhaps the Boost formal
review process has become defective.
For reference here is what the current API offers:
Example program that performs HTTP GET, then receives and prints the response:
https://github.com/vinniefalco/Beast/blob/master/examples/http_example.cpp
This is a high performance asynchronous HTTP server that serves local
files as static pages. It is not an official API, it is part of the
examples. Users who want a server need to copy the source code and
modify it to suit their needs:
https://github.com/vinniefalco/Beast/blob/master/examples/http_async_server.hpp
My initial purpose for requesting this discussion is two-fold:
1. Determine the level of consensus on the issue that Beast needs to
do more for HTTP than the lowest level of functionality (reading and
writing messages). I'm open to hearing all arguments for and against.
2. Get feedback on what official public interfaces to higher level
functionality might look like. Specifically, a client that handles GET
and POST, and a generic asynchronous server. And what features those
interfaces would support. For example, is traversing a proxy a
necessary feature for an official client interface? The best answers
would provide function signatures or class declarations using the
existing Beast HTTP types. In case it's not clear, I am asking
stakeholders and reviewers to help design the library.
I'll open a discourse on the HTTP client to illustrate the proposed
style of discussion:
We want to provide a HTTP client implementation that lets users
perform simple tasks like fetch a resource from a remote host. Calling
code should be compact, letting people get their job done without
getting mired in boilerplate. Here's a possible signature with example
use:
    response_v1<string_body>
    get(std::string const& url, error_code& ec);

    error_code ec;
    auto res = get("http://boost.org", ec);
    if(! ec)
        std::cout << res;
    else
        std::cerr << ec.message() << '\n';
The get function would automatically perform the name resolution,
connect the socket, request the resource with uri "/", and return the
result.
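Before get can resolve or connect, it has to take the URL apart. The following is a rough sketch of that first step, using only the standard library. The names url_parts and split_url are hypothetical and are not part of Beast. The parsing is deliberately naive: no userinfo, no IPv6 literals, no percent-decoding.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the pieces get() would need before it
// can resolve the name, connect, and send the request.
struct url_parts
{
    std::string scheme;  // "http" or "https"
    std::string host;    // e.g. "boost.org"
    std::string port;    // explicit port, or the scheme default
    std::string target;  // request target, "/" when the URL has no path
};

// Splits "scheme://host[:port][/target]".
// Returns false for input it does not understand.
inline bool
split_url(std::string const& url, url_parts& out)
{
    auto const sep = url.find("://");
    if(sep == std::string::npos || sep == 0)
        return false;
    out.scheme = url.substr(0, sep);
    if(out.scheme != "http" && out.scheme != "https")
        return false;

    auto const auth_begin = sep + 3;
    auto const path_begin = url.find('/', auth_begin);
    std::string const authority =
        path_begin == std::string::npos
            ? url.substr(auth_begin)
            : url.substr(auth_begin, path_begin - auth_begin);
    if(authority.empty())
        return false;

    auto const colon = authority.find(':');
    if(colon == std::string::npos)
    {
        out.host = authority;
        out.port = out.scheme == "https" ? "443" : "80";
    }
    else
    {
        out.host = authority.substr(0, colon);
        out.port = authority.substr(colon + 1);
        if(out.host.empty() || out.port.empty())
            return false;
    }

    // A URL with no path maps to the request target "/".
    out.target = path_begin == std::string::npos
        ? std::string("/")
        : url.substr(path_begin);
    return true;
}
```

With this sketch, "http://boost.org" yields host "boost.org", port "80" and target "/", matching the behavior described above.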
There's a problem here though: the caller is forced to use a message
with std::string as the container for the body. An obvious fix is to
add a template parameter:
    template<class Body = string_body>
    response_v1<Body>
    get(std::string const& url, error_code& ec);
Better, but what if the body is not DefaultConstructible? We can add
some parameters to forward to the Body constructor:
    template<class Body = string_body, class... Args>
    response_v1<Body>
    get(std::string const& url, error_code& ec, Args&&... args);
We're getting somewhere now. Most users will use the default body,
while an option exists for advanced users. But the implementation of
get now has a complication. If an error occurs, it still has to
construct a return value.
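The complication can be seen in a small stand-alone sketch. The types toy_message and capped_body and the function fetch are invented for illustration and are not Beast's. No network is involved: the "request" fails when the URL is empty and succeeds otherwise. The point is that the error path still has to build a body from the forwarded arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <system_error>
#include <utility>

// Toy stand-in for a response type; NOT Beast's response_v1.
template<class Body>
struct toy_message
{
    int status;
    Body body;
};

// A body that is not DefaultConstructible: it needs a size limit.
struct capped_body
{
    std::size_t limit;
    std::string data;

    explicit capped_body(std::size_t limit_)
        : limit(limit_)
    {
    }
};

// Hypothetical get-like function. The trailing arguments are
// forwarded to the Body constructor.
template<class Body, class... Args>
toy_message<Body>
fetch(std::string const& url, std::error_code& ec, Args&&... args)
{
    // The return value must exist on every path, so the body is
    // constructed up front, before we know whether the call succeeds.
    toy_message<Body> res{0, Body(std::forward<Args>(args)...)};
    if(url.empty())
    {
        ec = std::make_error_code(std::errc::invalid_argument);
        return res; // constructed, but carries no useful content
    }
    ec.clear();
    res.status = 200;
    return res;
}
```

One alternative, also only a suggestion, is to have the caller pass the response as an out-parameter so the function never has to manufacture one on failure.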
This problem is likely surmountable but let's table it and start a new
line of reasoning. What if the URI indicates https, e.g.
"https://boost.org"? We can hide that under the hood by using a
separate code path that uses asio::ssl::stream instead of
ip::tcp::socket. But how does the caller provide the root CA
certificates? How does the caller provide a client certificate? We are
left with a client that satisfies two thirds of users but is unusable
for the remaining one third. Is this Boost worthy? Is this std-lib
worthy? I think not.
Let's revisit the SSL issues later and think about a new problem. What
happens to the socket when the request is done? In the proposed
interface, the socket is destroyed. This means you can only make one
request on a connection.
At this point, I think it's clear that the current interface is a
non-starter. Here's a better idea. Let's define a class called client
and make get a member function. Now we can have state, including
configurable settings with reasonable defaults.
    class client
    {
    public:
        template<class Option>
        void
        set_option(Option&&);

        template<class Body = string_body, class... Args>
        response_v1<Body>
        get(std::string const& url, error_code& ec, Args&&... args);
    };
    client c;
    error_code ec;
    auto res = c.get("http://boost.org", ec);
    if(! ec)
        std::cout << res;
    else
        std::cerr << ec.message() << '\n';
We still have some of the original problems, but other problems are
solved. The class can keep the socket around, so that a subsequent
call to get with the same server will use the already-open socket. The
caller can set options on the client object instance before invoking
get, such as any TLS certificates.
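As a rough illustration of a stateful client, the sketch below models connection reuse and option setting without any real sockets. All names (toy_client, ca_file, keep_alive, connection_for) are hypothetical. It uses overloaded set_option functions on small option structs rather than the single template shown above, purely to keep the example short. A "connection" is just an integer id so the reuse behavior can be observed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical option types; the names are illustrative only.
struct ca_file
{
    std::string path;
};

struct keep_alive
{
    bool value;
};

class toy_client
{
    std::string ca_file_;
    bool keep_alive_ = true;
    std::map<std::string, int> open_; // host -> fake connection id
    int next_id_ = 0;

public:
    void
    set_option(ca_file const& o)
    {
        ca_file_ = o.path;
    }

    void
    set_option(keep_alive const& o)
    {
        keep_alive_ = o.value;
        if(! keep_alive_)
            open_.clear(); // drop any cached connections
    }

    // Returns the id of the connection a request to `host` would use.
    // With keep-alive on, a second request to the same host reuses
    // the cached connection; otherwise a fresh one is made each time.
    int
    connection_for(std::string const& host)
    {
        if(keep_alive_)
        {
            auto const it = open_.find(host);
            if(it != open_.end())
                return it->second;
        }
        int const id = ++next_id_;
        if(keep_alive_)
            open_[host] = id;
        return id;
    }
};
```

In this model two consecutive requests to "boost.org" share one connection, while a request to a different host, or any request after keep_alive{false} is set, gets a new one.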
Unfortunately we have new problems. Where is the io_service? If the
class creates it for you, how many threads? How do I use my existing
io_service, or is that even necessary? The class interface allows the
connection to be kept alive but what if now the caller wants to close
it (i.e. specify "Connection: close" in the request)?
And there are some problems which are applicable to any interface. How
does the caller adjust the parser options, for example change the
limit on the size of the body? What if the caller wants the request
made using a specific HTTP version? How does this handle basic
authentication? How can the caller adjust the headers in the request?
etc...
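One possible direction for the per-request questions, offered only as a sketch, is to gather the knobs into a small struct with defaults and pass it alongside the URL. The names request_params and render_get are hypothetical and not part of Beast. The function below only renders the request head as text, so the effect of each setting can be inspected. Authentication and parser limits are left out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-request settings with reasonable defaults.
struct request_params
{
    int version = 11;   // 10 for HTTP/1.0, 11 for HTTP/1.1
    bool close = false; // when true, send "Connection: close"
    std::vector<std::pair<std::string, std::string>> headers; // extra fields
};

// Renders the head of a GET request for the given host and target.
inline std::string
render_get(std::string const& host, std::string const& target,
    request_params const& p = request_params{})
{
    std::string s = "GET " + target + " HTTP/1.";
    s += (p.version == 10 ? "0" : "1");
    s += "\r\n";
    s += "Host: " + host + "\r\n";
    if(p.close)
        s += "Connection: close\r\n";
    for(auto const& h : p.headers)
        s += h.first + ": " + h.second + "\r\n";
    s += "\r\n"; // blank line ends the header block
    return s;
}
```

Callers who are happy with the defaults never mention request_params, while callers who need a specific version, extra headers, or "Connection: close" fill in only the fields they care about.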
This is why I think that offering a HTTP client is a rabbit hole which
can only lead to a rejected formal review. Or maybe I am overthinking
this? I'd love to hear proposed solutions to the issues (which could
be just to say "forget about those features").
Despite my objection I have started work on these features anyway.
Here are the beginnings of a HTTP client:
https://github.com/vinniefalco/Beast/blob/88b485df7e6216282842f40cf99cc75dee1b82d4/test/http/get.cpp#L123
Here is some work on a generic server:
https://github.com/vinniefalco/Beast/blob/server/extras/beast/http/async_server.hpp#L84
Please reply to this thread instead of starting a new thread, and keep
the subject line prefix (e.g. "Re: [boost] [beast] ...")
Thank you