From: Caleb Epstein (caleb.epstein_at_[hidden])
Date: 2005-12-28 11:21:14
On 12/28/05, Christopher Kohlhoff <chris_at_[hidden]> wrote:
> Hi Rene,
> --- Rene Rivera <grafik.list_at_[hidden]> wrote:
> > The sender doesn't discard.
> I didn't mean that the sender discards, only that the datagram
> being sent can be discarded immediately by the TCP/IP stack if
> the receiving socket's buffer is full. The sender is not aware
> of this.
> > If it were any other way the test that I wrote would show
> > "missing" receives.
> It does indeed exhibit many missing receives on two of my
> systems (one Linux, the other Mac OS X, both uniprocessor).
I think I'm seeing dropped packets when I run the test on an SMP Linux
2.4-based system (see below). With Linux 2.6, the packets all appear to
make it to the server, but we have the sync-faster-than-async condition.
> > But Caleb's results show all messages arriving, as would be
> > expected from the localhost device.
> I suspect that Caleb is using a multiprocessor machine, and so
> the behaviour I described does not happen for him. However on my
> systems with "flow control" enabled it is still necessary to
> include the additional performance optimisations to get the
> substantial improvement. It would be interesting to rerun the
> test on a multiprocessor machine with these changes included.
I've run the tests on an SMP machine and a single-CPU box. The SMP machine
is running a Redhat 2.4.21 kernel and the test actually doesn't work
properly on this platform. It seems that the receiver drops many packets
and, because of the way the test is written, the program crashes when trying
to print the incomplete result data. I added some debugging print
statements that show the async_server receives about 75,000 of the 600,000
packets sent before it is forcefully stopped.
When I run the tests on my single-CPU machine running Linux 220.127.116.11 at
home, the test completes properly. I've tried both the epoll and
select-based reactors and the results are nearly identical (approx. 3x
slower than sync).
I think it's difficult to say where the speed difference comes from on these
tests. It may exercise some inefficiencies or limitations of the Linux UDP
stack. Perhaps the syscall overhead of using select/epoll is where all the
performance is lost. Or it could be the architecture of asio that is at
fault. I don't think we can say for sure at this point.
I'll see if I can't knock up a straight socket-based test like this one (e.g.
just POSIX socket calls, no asio) and see if it gets similar results to
Rene's benchmark. The same benchmark run on some other UNIX-based system
(MacOS, *BSD, etc) would be another interesting data-point to have to see if
this is an implementation or a platform issue.
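For illustration only, here is a minimal sketch of what a plain POSIX socket test could look like. It is not Rene's benchmark and it is much simpler: a single thread sends one datagram over loopback and then immediately receives it, so the receive buffer never fills. The function name loopback_udp_round and its parameters are hypothetical, and the one-second receive timeout is an arbitrary choice to keep the loop from hanging if a datagram is lost.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: sends `count` datagrams of `size` bytes over the
// loopback device, one at a time, and returns how many came back intact.
// Returns -1 if the sockets could not be set up.
int loopback_udp_round(int count, std::size_t size) {
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0) {
        if (rx >= 0) close(rx);
        if (tx >= 0) close(tx);
        return -1;
    }

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the kernel pick a free port

    socklen_t len = sizeof addr;
    timeval tv = {1, 0};  // assumed 1 s receive timeout
    if (bind(rx, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0 ||
        getsockname(rx, reinterpret_cast<sockaddr*>(&addr), &len) < 0 ||
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0) {
        close(rx);
        close(tx);
        return -1;
    }

    std::vector<char> out(size, 'x'), in(size);
    int received = 0;
    for (int i = 0; i < count; ++i) {
        // Send one datagram, then block until it arrives (or the timeout fires).
        if (sendto(tx, out.data(), out.size(), 0,
                   reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0)
            break;
        ssize_t n = recvfrom(rx, in.data(), in.size(), 0, nullptr, nullptr);
        if (n == static_cast<ssize_t>(size)) ++received;
    }

    close(rx);
    close(tx);
    return received;
}
```

Calling loopback_udp_round(600000, 1024) and timing it would give a rough baseline for the raw socket path. A test closer to the benchmark would run the sender and receiver in separate processes, which is where dropped datagrams could show up.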
Just a quick back-of-the-envelope calculation. The synchronous test manages
to handle about 65,000 1k messages a second, which amounts to a bandwidth of
approximately 500 Megabits/second. It would certainly be nice to be able to
sustain this level of throughput, but I'm not sure it's an entirely realistic
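The arithmetic behind that estimate can be written out as a small sketch. It assumes "1k" means 1024 bytes of payload per message and ignores UDP/IP header overhead; the function name bits_per_second is made up for this example.

```cpp
#include <cstdint>

// Payload bandwidth in bits per second for a given message rate and size.
// Header overhead is deliberately ignored (an assumption of this sketch).
constexpr std::uint64_t bits_per_second(std::uint64_t messages_per_second,
                                        std::uint64_t bytes_per_message) {
    return messages_per_second * bytes_per_message * 8;
}

// 65,000 messages/s * 1024 bytes * 8 bits = 532,480,000 bits/s,
// i.e. roughly 500 Megabits/second.
static_assert(bits_per_second(65000, 1024) == 532480000ULL,
              "65,000 1k messages per second is about 532 Mbit/s");
```

Taking "1k" as 1000 bytes instead gives 520,000,000 bits/s, so the rough figure holds either way.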
-- Caleb Epstein caleb dot epstein at gmail dot com