Boost Users :

From: dizzy (dizzy_at_[hidden])
Date: 2008-08-09 08:07:46


Hello

I have recently started to rework a network server project, moving it from a
thread-per-connection handling model (with blocking/synchronous network I/O)
to asio and an asynchronous operation model (where the number of threads
should depend on hardware resources and be decoupled from software design
constraints).

The problem I have with this is that it makes one write too much boilerplate
code for all those "completion handlers". Take this very simple example of
code that uses the one-thread-per-connection synchronous I/O model:

// synchronously resolves, connects, throws on error
connection con("1.2.3.4");

{
        // application "packet", usually known as "record"
        // uses "con" and starts an application packet with a given type
        packet pack(con, PacketTypeRequest);

        // serializes portably given data and synchronously sends the data
        // throws on error
        pack << uint32(10) << std::string("/home/user") << uint8(3);

} // calls packet::~packet() which signals in the "con" stream end of packet

Now, to transform this into an asynchronous version (I repeat, this is a very
simple example; far more complex ones write "packets" of dozens of fields, of
which the last one may have "infinite" length, meaning you cannot know its
size in advance to put it in the stream, so you just have to send as much as
you can and then signal the end of the "packet"), one would need:

connection con; // breaks invariants of "con" being always connected

// should asynchronously resolve and connect
con.async_connect("1.2.3.4", handle_connect);
// break code flow here

// handle_connect()
// breaks invariant of allowing "serialization" only after packet type
// has been sent over the network
packet pack(con);

pack.async_serialize(uint32(10), handle_serialize1);
// return to caller

// handle_serialize1
pack.async_serialize(std::string("/home/user"), handle_serialize2);
// return to caller

// handle_serialize2
pack.async_serialize(uint8(3), handle_serialize3);
// return to caller

// handle_serialize3
// breaks RAIIness of original packet which automatically signaled
// "end" from the dtor of it
pack.async_end(handle_endpacket);
// return to caller

And imagine that the original code was just a small function in a big class,
so now each such small function turns into dozens of smaller functions; the
code explosion is huge.

I am curious about the coding practices that some of you employ to solve these
issues (to still have code that is checked at compile time as much as possible
through strong invariants and RAII idioms, without having to write millions of
small functions).

Some of the things I have thought of that seem to solve these issues:

- instead of serializing packets ad hoc, have structures that encapsulate the
network packets, move the serialization code into them and let them deal with
all the small functions (they could use a buffer to cache the serialization
of the fixed fields and async_write that buffer's contents in a single
operation); this however means an additional copy of the data compared to how
the code was before, and it just moves the problem: instead of having many
small functions in the high-level code you have them in the lower-level
packet serialization code (though the output buffer can eliminate some of
them)

- using expression templates or whatever, do some kind of "lazy evaluation";
basically still use syntax similar to the synchronous solution, like:
pack << uint32(10) << std::string("/home/user") << uint8(3);
but instead of doing network I/O this code would enqueue the needed
serialization actions, keep all those completion handlers internal and hide
all those details from the user completely; the user just does a
pack.async_write(user_handler) and "user_handler" will be called after all the
serializations have been asynchronously written

- if this weren't C++ but C, we could use setjmp/longjmp to simulate
synchronous operation transparently (to the user) while asynchronous
operation is done behind the scenes; the user writes code such as:
pack << uint32(10) << std::string("/home/user") << uint8(3);
but on each op<< (each asynchronous serialization) the code does an
async_write() with an internal handler, saves the context (setjmp), then
longjmp()s to the previously saved context in the main event loop; when the
async_write completes, the handler longjmp()s back to the saved user context
and continues with the next op<<, and so on; this however does not work in
C++, because the code path jumped over with longjmp may have exceptions being
thrown/caught, and as the standard says that's UB; not to mention that, from
what I could gather on the Internet, some C++ compilers call dtors of auto
objects when you longjmp "back", thus not leaving the original context
untouched (which is what I need)
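
To make the second idea a bit more concrete, here is a rough sketch of what I
have in mind. It is not asio code: "fake_connection" is a made-up stand-in for
the socket plus event loop, "lazy_packet" and "demo" are hypothetical names,
and the wire format (one type byte, big-endian integers, length-prefixed
strings) is only assumed for illustration. It also uses lambdas and
std::function for brevity. Each op<< only enqueues a chunk; async_write()
starts the chain and an internal handler sends the next chunk until none are
left, then calls the user handler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::uint8_t> bytes;

// Hypothetical stand-in for socket + event loop: async_write() copies the
// bytes to "wire" and defers the completion handler until run() is called.
struct fake_connection {
    bytes wire;
    std::deque<std::function<void()>> pending;
    std::size_t writes = 0;

    void async_write(const bytes& data, std::function<void()> handler) {
        assert(handler);  // a completion handler is required
        wire.insert(wire.end(), data.begin(), data.end());
        ++writes;
        pending.push_back(std::move(handler));
    }
    void run() {
        while (!pending.empty()) {
            std::function<void()> h = std::move(pending.front());
            pending.pop_front();
            h();
        }
    }
};

class lazy_packet {
public:
    // The packet type is queued first, so the "type before fields" invariant holds.
    lazy_packet(fake_connection& con, std::uint8_t type)
        : st_(std::make_shared<state>(con)) { st_->chunks.push_back(bytes(1, type)); }

    lazy_packet& operator<<(std::uint32_t v) {  // big-endian (assumed format)
        bytes b;
        for (int shift = 24; shift >= 0; shift -= 8)
            b.push_back(static_cast<std::uint8_t>(v >> shift));
        st_->chunks.push_back(std::move(b));
        return *this;
    }
    lazy_packet& operator<<(std::uint8_t v) {
        st_->chunks.push_back(bytes(1, v));
        return *this;
    }
    lazy_packet& operator<<(const std::string& s) {  // length prefix, then data
        *this << static_cast<std::uint32_t>(s.size());
        st_->chunks.push_back(bytes(s.begin(), s.end()));
        return *this;
    }

    // No I/O happens before this call; user_handler runs after the last chunk.
    void async_write(std::function<void()> user_handler) {
        st_->done = std::move(user_handler);
        step(st_);
    }

private:
    struct state {
        explicit state(fake_connection& c) : con(c) {}
        fake_connection& con;
        std::deque<bytes> chunks;
        std::function<void()> done;
    };
    // The single internal completion handler: write the next chunk or finish.
    static void step(std::shared_ptr<state> st) {
        if (st->chunks.empty()) { if (st->done) st->done(); return; }
        bytes next = std::move(st->chunks.front());
        st->chunks.pop_front();
        st->con.async_write(next, [st] { step(st); });
    }
    std::shared_ptr<state> st_;  // shared so the chain outlives the packet object
};

// Usage: returns true once the user handler has run.
bool demo(fake_connection& con) {
    bool finished = false;
    lazy_packet pack(con, 1);  // 1 is a made-up packet type value
    pack << std::uint32_t(10) << std::string("/home/user") << std::uint8_t(3);
    pack.async_write([&finished] { finished = true; });
    con.run();
    return finished;
}
```

The user code stays a single expression plus one handler, but the cost is the
one I mentioned for the first idea: every field is copied into a queued chunk
before being written, and the "end of packet" signal is no longer tied to the
dtor.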

-- 
Mihai RUSU
                      "Linux is obsolete" -- AST