
From: Jeremy Maitin-Shepard (jbms_at_[hidden])
Date: 2004-01-28 15:16:03


Sean Kelly <sean_at_[hidden]> writes:

> Jeremy Maitin-Shepard wrote:
>> It seems that network sockets are the most common use case for
>> nonblocking I/O multiplexing (i.e. the reactor pattern). If a library
>> restricts itself to those alone, an efficient portable implementation
>> may be possible.

> I would say restrict it to socket and file i/o and expect that it would be used
> almost exclusively for socket i/o.

As far as I know, it is not possible to poll file descriptors (HANDLEs)
using the Windows API, either through something like WSAAsyncSelect
(i.e. sending messages to a Windows message queue) or through
WSAEventSelect (waiting on an event, which is not a great solution since
it requires use of WaitForMultipleObjects (WFMO), which is limited to 64
events). This can't even be solved by using multiple threads; I believe
that polling of file handles is simply not supported on Windows.
Asynchronous completion of socket read operations and file read
operations can indeed be unified, but I would not suggest that a Boost
library leave out polling.

>> On the other hand, I believe it would be possible to create a portable
>> library which utilized either WSAAsyncSelect or select on Windows
>> platforms, kqueue on BSD platforms, epoll on Linux kernels that support
>> it, and poll on other POSIX platforms. (On Solaris, I believe it would
>> be possible to utilize /dev/poll.)

> I'd like to focus on completion routines or methods that could mimic them. IOCP
> in Windows, /dev/poll in Solaris and either /dev/poll or epoll in Linux, perhaps
> kqueue in BSD? Would all of those serve? My experience is mostly with IOCP but
> there's a Dr. Dobbs article this month that uses /dev/poll in the same
> model.

AFAIK, neither /dev/poll nor epoll provides functionality similar to
completion ports. /dev/poll, epoll and kqueue poll a file descriptor
for readiness: _available_ data to read, _available_ buffer space to
write, _available_ out-of-band data to read, a completed connection in
the case of a connecting socket, and an _available_ new connection to
accept in the case of a listening socket. Windows completion
ports, in contrast, allow you to determine when a particular write
operation or read operation has been completed. Thus, this is a
fundamentally different model. Windows completion ports are similar to
using the aio_* family of functions on POSIX platforms, but on POSIX
platforms, it is generally more convenient and efficient to use polling.
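
To make the readiness model concrete, here is a minimal sketch using
POSIX poll() on a pipe. The helper names read_when_ready and
demo_readiness are made up for illustration. The point is only that
poll() reports that data is available; the read itself is still a
separate call made by the application, whereas a completion port would
report that a previously issued read has finished.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <string>

// Readiness model: poll() only says that data is *available* on fd.
// The caller must still perform the read itself.
// (read_when_ready is a hypothetical helper, not part of any library.)
std::string read_when_ready(int fd, int timeout_ms) {
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;
    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0 || !(pfd.revents & POLLIN)) {
        return std::string();  // timeout, error, or nothing readable
    }
    char buf[256];
    ssize_t got = read(fd, buf, sizeof buf);
    return got > 0 ? std::string(buf, static_cast<std::size_t>(got))
                   : std::string();
}

// Small demonstration on a pipe: nothing is reported before the write,
// and the written bytes are returned after it.
std::string demo_readiness() {
    int fds[2];
    if (pipe(fds) != 0) return std::string();
    std::string before = read_when_ready(fds[0], 0);
    assert(before.empty());  // no data was written yet
    ssize_t w = write(fds[1], "hello", 5);
    std::string after =
        (w == 5) ? read_when_ready(fds[0], 1000) : std::string();
    close(fds[0]);
    close(fds[1]);
    return after;
}
```

Calling demo_readiness() exercises both cases: the first poll times out
immediately because the pipe is empty, and the second reports POLLIN
once the write end has been written to.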

The nature of the aio_* functions is that they do not scale efficiently
to waiting on a large number of concurrent operations. Specifically, it
is often necessary to check the status of each operation after the
aio_suspend function returns. Systems like epoll, /dev/poll, and kqueue
scale very well to a large number of file descriptors. On Windows
platforms, it seems that systems like WSAAsyncSelect, WSAEventSelect,
select, and WFMO do not scale very well to a large number of socket
descriptors/handles. The fact that Windows only scales efficiently
using one model, while UNIX platforms only scale efficiently using the
other model, is an added and particularly problematic issue in writing a
portable library. Avoiding this would require a very substantial amount
of abstraction, and I would argue that mandating that layer of
abstraction on all users would make the library less suitable for
certain tasks.
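
As a rough illustration of why the polling systems scale, the following
Linux-only sketch registers several pipes with epoll and makes one of
them readable. The function name ready_indices and the use of pipes are
assumptions for the example. epoll_wait() hands back only the
descriptors that are ready, so the work after it returns is proportional
to the number of ready descriptors rather than to the number registered.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>
#include <vector>

// Linux-only sketch. Registers the read ends of `count` pipes with
// epoll, writes one byte to pipe number `ready_index`, and returns the
// indices that epoll_wait() reports as readable.
// (ready_indices is a hypothetical name used only for this example.)
std::vector<int> ready_indices(int count, int ready_index) {
    assert(count > 0 && ready_index >= 0 && ready_index < count);
    std::vector<int> result;
    int ep = epoll_create1(0);
    if (ep < 0) return result;

    std::vector<int> rd, wr;
    bool ok = true;
    for (int i = 0; i < count && ok; ++i) {
        int p[2];
        if (pipe(p) != 0) { ok = false; break; }
        rd.push_back(p[0]);
        wr.push_back(p[1]);
        epoll_event ev{};
        ev.events = EPOLLIN;
        ev.data.u32 = static_cast<std::uint32_t>(i);  // which pipe this is
        if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) != 0) ok = false;
    }

    if (ok && write(wr[ready_index], "x", 1) == 1) {
        std::vector<epoll_event> events(count);
        // Only ready descriptors are returned; idle ones cost nothing here.
        int n = epoll_wait(ep, events.data(), count, 1000);
        for (int i = 0; i < n; ++i) {
            result.push_back(static_cast<int>(events[i].data.u32));
        }
    }

    for (int fd : rd) close(fd);
    for (int fd : wr) close(fd);
    close(ep);
    return result;
}
```

With eight pipes registered and only pipe 5 written to, the returned
vector contains the single index 5; the other seven descriptors are
never examined by the caller.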

>> [snip]

> Windows has asynch listen but it's kind of annoying to use. I've always used
> one or more threads for the task--overhead is minimal since they're basically
> always blocking. Beyond that, I think the i/o layer should be a thread pool
> that does i/o processing exclusively. Since it's a thread pool some
> synchronization is already required and extending execution into the rest of the
> application code seems like an invitation to trouble.

I believe it would be possible to create a portable library which used
either asynchronous operations or polling, and automatically accepted
all incoming connections, read all available data, and provided a send
buffer which would get sent automatically. This could all be done using
only a single thread. The user could specify certain limits on the
number of connections to accept, and the size of the read and write
buffers. I do not think it is possible to create a significantly lower
level interface which is also portable and efficient.
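
A minimal sketch of the buffering part of such an interface, assuming a
POSIX platform and poll() as the backend. The class name
BufferedConnection and its members are invented for this example and do
not come from any existing library; accepting connections and the
Windows backend are left out. Each call to pump() is one step of a
single-threaded loop: it flushes as much of the send buffer as the
socket will take and reads available data up to a user-specified limit.

```cpp
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical single-threaded buffered connection (illustration only).
class BufferedConnection {
public:
    BufferedConnection(int fd, std::size_t max_read_buffer)
        : fd_(fd), max_read_(max_read_buffer) {
        assert(max_read_buffer > 0);
        int flags = fcntl(fd_, F_GETFL, 0);
        fcntl(fd_, F_SETFL, flags | O_NONBLOCK);  // never block in pump()
    }

    // Queue data; it is sent automatically by later pump() calls.
    void queue_send(const std::string& data) { out_ += data; }
    const std::string& received() const { return in_; }

    // One step of the event loop for this connection.
    void pump(int timeout_ms) {
        pollfd pfd{};
        pfd.fd = fd_;
        if (in_.size() < max_read_) pfd.events |= POLLIN;  // room to read
        if (!out_.empty()) pfd.events |= POLLOUT;          // data to send
        if (poll(&pfd, 1, timeout_ms) <= 0) return;
        if (pfd.revents & POLLOUT) {
            ssize_t n = ::write(fd_, out_.data(), out_.size());
            if (n > 0) out_.erase(0, static_cast<std::size_t>(n));
        }
        if (pfd.revents & POLLIN) {
            char buf[4096];
            std::size_t room = max_read_ - in_.size();
            ssize_t n = ::read(fd_, buf, room < sizeof buf ? room : sizeof buf);
            if (n > 0) in_.append(buf, static_cast<std::size_t>(n));
        }
    }

private:
    int fd_;
    std::size_t max_read_;  // user-specified limit on buffered input
    std::string in_;        // data read so far
    std::string out_;       // data waiting to be sent
};

// Drives two connected endpoints from one thread until "ping" arrives.
std::string demo_echo_once() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return std::string();
    BufferedConnection a(sv[0], 1024), b(sv[1], 1024);
    a.queue_send("ping");
    for (int i = 0; i < 10 && b.received().size() < 4; ++i) {
        a.pump(100);
        b.pump(100);
    }
    close(sv[0]);
    close(sv[1]);
    return b.received();
}
```

The read limit is what lets the user bound memory per connection: once
the input buffer is full, pump() stops asking poll() for POLLIN, so the
peer is throttled by the kernel's socket buffer rather than by the
library.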

> [snip]

-- 
Jeremy Maitin-Shepard
