|
Boost : |
From: Giovanni P. Deretta (gpderetta_at_[hidden])
Date: 2005-12-30 13:50:36
Hi everybody, I hope not to have missed the review deadline...
First of all I vote *not* to accept asio in boost yet. I think that the
library is very promising but there are some issues. If the library is
not accepted, is strongly encourage the author to subimit it again once
the issues are resolved.
- What is your evaluation of the design?
Promising.
I think that the most important innovation of asio is the dispatcher
concept and the async call pattern. Everything else is just *extra*.
It probably should be removed in some way from asio and made a separate
boost library (or may be just rename asio to something like "asynch lib"
or whatever).
The interface provided by asio is very low level, but i think this is a
strenght of asio, not a weakness. Higher level interfaces (iostreams,
buffer management etc...) can be added later as external components or
extensions to the library itself.
The {datagram|stream}_socket interface is nice, but i have some issues
with them.
First of all the name "socket" is too much connected with the BSD socket
API. I think that asio should go well beyond it and not limit itself to
the system socket types (not that it does, but the names might seem to
imply that). This is just a personal preference though.
Also i think there should be *no* asio::stream_socket. This is my major
The current stream socket should be in the namespace asio::ipv4, and
should be a different type for example from an eventual
asio::posix_pipe::stream_socket or asio::unix::stream_socket. Especially
the last can be currently trivially implemented by defining the
appropriate protocol. This means that a stream_socket initialized with a
ipv4::tcp protocol will interoperate with a stream_socket initialized
with a unix::stream protocol. For example, currently i can use my
hipotethical unix::stream with ipv4::resolver. Asio should not be type
unsafe only because the BSD API is. Of course both should share the same
code, but it should be an implementation detail.
I like the sync and asynch read/write interface and I have no problem
with them being in the same object (i.e. no separation of asynch socket
and sync socket). I agree that the proactor patter is the more portable
than the reactor one, especially if asio is extended to filesystem i/o.
I think the interface could be extended for better efficiency (i will
try to describe later how), but this can be done at a later stage.
I don't really like a lot the service concept. While I agree that the
service repository is useful to hold the various proactor
implementations (iocp_service, reactive_service<kqueue>,
reactive_service<epoll>, etc...), i don't think that the streams should
care about them. As long as the impl_type are interoperable, any stream
should be bindable with any service. This also means that the binding
with the demuxer should *not* be done at creation time, but at open
time. This will make it possible to have different openers: connector,
acceptor, unix::socket_pair, posix::pipe etc. The openers are instead
created with a demuxer reference and will pass it to the socket when
opened. By the way, the implementation of socket functions (accept,
connect, read, write, etc) should not be members of the demuxer service
but free functions or, better, static members of a policy class.
The buffer concept does not really add much, simply using void pointers
and size_t length parameters might require less conceptual overhead.
They don't add much security (it should be the responsibility of higher
layers along with buffer managements), and in practice don't help a lot
with scatter/gather i/o: often you can't reuse an existing vector of
iovecs, because you need to do an operation from/to the middle of it and
thus you need to build a temporary one. As the iovector is usually small
(16 elements may be?) and can be stack allocated, I think that having a
global buffer type is a premature optimization. Instead there should be
a *per stream type*, because a custom stream might use a different
iovector implementation that the system one ( io_vec or whatever). It
should have at least push_back(void*, size_t) and size() members.
Scatter gather operations should accept an iterator to first element of
the operation (usually the first element of the vector).
The current interface let the user put a socket in non blocking mode,
but there is not much that can be done with that, because no reactor is
exported. The various reactors/proactor implementations should be
removed from the detail namespace and be promoted to public interfaces,
albeit in their own namespace (i.e. win32::iocp_proactor,
posix::select_reactor, linux::epoll_reactor etc...). This change will
make the library a lot more useful.
The buffered stream is almost useless. Any operation requires two
copies, one from kernel to user space, and one from the internal buffer
to the user buffer. The internal buffer should be unlimited in length
(using some kind of deque) and accessible to eliminate copies. An
interface for i/o that does not require copying would be generally
usefull and not limited to buffered streams.
- What is your evaluation of the implementation?
Good. The code is very clean and readable. Often is more useful than the
documentation. There are inefficiencies caused by too much dynamic
allocations, mutexes and unnecessary system calls, but they are not
intrinsic in the desing and I see from other posting that they are
currently being addresed by the author.
Probably there should be more code reuse. For example the
asio::ssl::stream duplicates much of the code of asio::stream_base.
I think that the buffer function does not support vectors with custom
allocators, but i might be missing something.
On some systems there is already an asynch resolver call (glibc provides
getaddrinfo_a). Asio should use them instead of its own thread based
emulation. The glibc api still use threads, but a strictly single
threaded implementation is theoretically possible.
The implemenation is header only. It makes the library easier to use,
but network system headers are known to pollute the enviroment :)
- What is your evaluation of the documentation?
Insufficient. One of the best thing of boost is the quality of its
documentation. I don't think that asio is on par with boost standards.
While the examples are good and well described, the API documentation is
not very useful. It really shows its autogenerated-ness :)
- What is your evaluation of the potential usefulness of the library?
Extremely useful. There have been lots of requests and proposal for a
boost.net library. Asio is certanly the most promising.
- Did you try to use the library?
While I didn't use all of it (No timers nor SSL), as an experiment I did
write a simple continuation library using asio::demuxer as a scheduler
and using asio callback pattern to restart coroutines waiting for i/o.
Asio's demuxer cleaness and callback guarantees made the implementation
very straightforward. If some one is interested i may upload the code
somewhere. Currently it is posix only (it uses the makecontext family of
system calls), but it should be fairly easy to implement win32 fiber
support.
- How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?
In depth. I've read most of the documentation and studied the
implementation, or at leas part of it.
- Are you knowledgeable about the problem domain?
Yes, a bit. I've written my own network library, although synchronous
only. I've studied the asynch problem in depth, but I've only
theorethical knowledge of it. I did not add async support to my library
because I couldn't find a good async interface.
---- At this point the review if finished. I'will continue with a wish list of the things i would like to see in asio, but don't necessarily belong to a 1.0 version: - Asynchornous completion Tokens. While callbacks are useful most of the time, sometimes ACT are a bettere solution. It would be nice if asio did let the user specify an act instead of a callback to async functions. Then the demuxer would act as a container of ACTs that could be enumerated. - Lazy allocation of buffers. Instead of passing a buffer or a list of buffers to an async io function, a special allocator is passed. When the buffer is needed (before the async syscall on real proactors and before the non blocking call, but after polling on emulated proactors), the allocator is called and it will return a buffer to be used for the operation. The buffer is then passed to the handler. It will eliminate the need to commit buffer memory for inactive connections. This mostly makes sense for reading, but it might be extended to writing too. This optimization make it possible to add... - Multishot calls. Instead of the regular "exactly one callback for async_call", a client might register a callback for multiple wakeups. For example a async_multi_read_some will be called with a buffer allocator and a callback as parameter and will return a registration token. Every time some data is available from the socket, a buffer is allocated, data is read and a callback is done. Note that the same callback is reused, so there is no need to allocate and copy multiple callbacks. This optimization, for example, will greatly reduce the number of syscalls done in reactive demuxer. The notification of interest to the system demuxer is only done once, then as long as the read syscall does not block, no other syscalls need to be done. This will probably be useful mostly with datagram servers. The same optimization can be done to async_accept. - Relinquising ownership of buffers or lending them. While the normal BSD api requires the copying of buffers anyway, there are streams that can be implemented without copying. For example a process-local shared memory stream can be implemented by simply moving pointer to buffers between threads. A buffered stream might take ownership of a supplied buffer and add it to its own buffer list. Giving and taking ownership is not the only action possible, but a immutable buffer could be loaned to a stream. This is useful for caching. For example, when reading from a file a cached file stream can be used. The stream will check in the cache if the file page is present, and if so, it will return a shared, reference counted, pointer to the page as an immutable buffer. As long as the caller only reads from the supplied buffer (for example to send it to a remote host), no copying is required. I think that's all ... for now :). Sorry for the long review. --- Giovanni P. Deretta
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk