From: Jody Hagins (jody-boost-011304_at_[hidden])
Date: 2004-11-30 23:13:43
On Sun, 17 Oct 2004 19:36:39 -0500
"Aaron W. LaFramboise" <aaronrabiddog51_at_[hidden]> wrote:
> On the other hand, as Alexandrescu in _Modern C++ Design_ quotes Len
> Lattanzi, "Belated pessimization is the leaf of no good."
> Alexandrescu goes on to write, "A pessimization of one order of
> magnitude in the runtime of a core object like a functor, a smart
> pointer, or a string can easily make the difference between success
> and failure for a whole project" (Ch. 4, p. 77). Library writers
> often do not have the same liberty as application writers to go back
> later, after profiling, and fix their slow code.
Right. My application is entirely event driven, with boost::signal
dispatching every event. The application will be processing thousands
of messages per second, each one causing at least one signal() call.
> Preliminary analysis suggests that the performance efficiency of the
> demultiplexor when using this dispatcher is "bad." While I can not
> yet offer meaningful numbers, it seems that there is something quite
> slow happening under the covers in the Signals library. I have not
> examined the implementation of the Signals library at all, however.
> The extent of this problem, and whether it is easily fixable, remains
> to be seen.
I have not done much examination either, but to satisfy my curiosity, I
hacked together a "lite" implementation of the signals interface that
provides *minimum* functionality. Same interface as boost::signal<>,
w.r.t. connect(), disconnect() and operator() (i.e., dispatching a
signal). It also allows connect/disconnect/replace inside a slot handler.
It does not provide return value combining and the fancier features.
However, I can do this...
#define PE_SIGNAL_TEMPLATE lite::signals::Signal
#define PE_SIGNAL_TEMPLATE boost::signal
and easily switch between boost::signal and my lite version. Thus, I
can verify that my code compiles and runs the same (and I can verify
that the tests give the same results for either implementation).
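As posted, the two #define lines have no conditional around them, so a guard of some kind has presumably been lost. A minimal sketch of such a switch follows, assuming C++11 or later. The macro name PE_USE_LITE_SIGNAL and the two stand-in templates are hypothetical; they stand in for lite::signals::Signal and boost::signal only so the example compiles on its own.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins so this sketch builds without Boost or the attached
// headers. Each reports its own name so the selected one is observable.
namespace standin_full {
template <typename Signature>
struct Signal {
    static std::string name() { return "full"; }
    void connect(std::function<Signature> slot) { slots.push_back(slot); }
    void operator()() const { for (const auto& s : slots) s(); }
    std::vector<std::function<Signature>> slots;
};
}  // namespace standin_full

namespace standin_lite {
template <typename Signature>
struct Signal {
    static std::string name() { return "lite"; }
    void connect(std::function<Signature> slot) { slots.push_back(slot); }
    void operator()() const { for (const auto& s : slots) s(); }
    std::vector<std::function<Signature>> slots;
};
}  // namespace standin_lite

// PE_USE_LITE_SIGNAL is an assumed name, e.g. passed as -DPE_USE_LITE_SIGNAL.
#if defined(PE_USE_LITE_SIGNAL)
#  define PE_SIGNAL_TEMPLATE standin_lite::Signal
#else
#  define PE_SIGNAL_TEMPLATE standin_full::Signal
#endif

// Client code names only the macro, never a concrete implementation.
typedef PE_SIGNAL_TEMPLATE<void()> EventSignal;

// Connects one slot, emits once, and returns how many times the slot ran.
inline int fire_once() {
    int hits = 0;
    EventSignal sig;
    sig.connect([&hits] { ++hits; });
    sig();
    return hits;
}
```

Because the rest of the code only ever mentions PE_SIGNAL_TEMPLATE, rebuilding with or without the flag is enough to run the same tests against either implementation.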
However, a very quick test shows that boost::signal is at least 2 orders
of magnitude slower than the "lite" version. Note that this code
compiles/runs on g++, and I have not tried it on other compilers.
The attached files...
Connection.hpp and Signal.hpp implement the "lite" signal interface.
speed_test.cpp is a first glance at speed comparisons between
boost::signal and the "lite" version.
pt50.txt is output of running the test on a SuSE based Opteron.
shandalle.txt is output of running the test on a RH7.3 based Xeon.
The format is...
===== 1000000 Total Calls =====
Num Slots  Calls/Slot  Boost (s)  Lite (s)
---------  ----------  ---------  --------
        1     1000000    17.9360    0.0701
       10      100000     5.3777    0.0289
       50       20000     4.0621    0.0244
      100       10000     3.9715    0.0232
      250        4000     3.8375    0.0254
      500        2000     3.8873    0.0237
     1000        1000     3.7644    0.0229
     5000         200     4.0361    0.1169
    10000         100     4.0450    0.1678
    50000          20     3.9394    0.1953
   100000          10     3.9916    0.2268
   500000           2     3.8303    0.1128
The first line means that we connected 1 slot to the signal, and called
signal() 1000000 times, resulting in a total of 1000000 slot invocations
(sort of; there are a few extra). It took 17.9360 seconds to make
these calls with the boost version, and 0.0701 seconds to make these
calls with the lite version.
The last line, obviously, means that we connected 500000 slots to the
signal, and called signal() 2 times, resulting in the same number of
slot invocations, with a time of 3.8303 seconds for boost and 0.1128
seconds for lite.
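For readers who want to reproduce a row of this kind, here is a rough sketch of how one might be timed, assuming C++11 or later. It is not the attached speed_test.cpp: VectorSignal, RunResult and time_row are hypothetical names, and the signal is a plain vector of callbacks rather than either implementation being compared.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct RunResult {
    std::size_t invocations;  // slot invocations actually observed
    double seconds;           // wall-clock time of the dispatch loop only
};

// Stand-in signal (an assumption, not the attached code).
struct VectorSignal {
    std::vector<std::function<void()>> slots;
    void connect(std::function<void()> f) { slots.push_back(std::move(f)); }
    void operator()() const { for (const auto& s : slots) s(); }
};

// Times one table row: num_slots slots connected, and the signal emitted
// total_calls / num_slots times (the "Calls/Slot" column).
inline RunResult time_row(std::size_t num_slots, std::size_t total_calls) {
    if (num_slots == 0) return RunResult{0, 0.0};

    std::size_t counter = 0;
    VectorSignal sig;
    for (std::size_t i = 0; i < num_slots; ++i)
        sig.connect([&counter] { ++counter; });

    const std::size_t emits = total_calls / num_slots;

    // Connection cost is deliberately excluded from the timed region.
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < emits; ++i) sig();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return RunResult{counter, elapsed.count()};
}
```

Holding the total number of slot invocations constant while varying the slot count is what lets the per-emit overhead be told apart from the per-slot overhead: the first row is dominated by the former, the last row by the latter.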
Note that there seems to be some heavy overhead just in calling
signal(). The output files show several different size runs, on two
different architectures/compiler versions.
The test is not really that great, but is, I think, a reasonable first
attempt at measuring the performance. Of course, I understand that I
may be measuring the worst case features of signal, and the lite version
does not even come close to the quality, depth, or breadth of the boost
implementation. However, it would be nice if someone could give
boost::signal a "boost" for what I think are very common use cases. It
seems that using it in a simple way requires a heavy price for features
that are not used. This seems to fly in the face of common library
design (where we ATTEMPT to make users not have to pay for things they
are not using).
Also, note that the lite version degrades as the number of slots
increases, indicative of the overhead of iterating through the list of
"slots." There is a measurable difference in overhead between std::list
and std::vector, but with std::list it is easier to support
connect/disconnect from within a "slot" handler.
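To illustrate why std::list makes this easier, here is a minimal sketch, assuming C++11 or later. It is not the attached Signal.hpp; ListSignal and its members are hypothetical. It relies on two things: std::list iterators stay valid when other elements are inserted, and removal during dispatch can be deferred by marking an entry inactive.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <list>
#include <utility>
#include <vector>

class ListSignal {
public:
    // Returns an id that can later be passed to disconnect().
    int connect(std::function<void()> fn) {
        slots_.push_back(Entry{next_id_, std::move(fn), true});
        return next_id_++;
    }

    void disconnect(int id) {
        for (auto it = slots_.begin(); it != slots_.end(); ++it) {
            if (it->id != id) continue;
            if (depth_ > 0) it->active = false;  // mid-dispatch: defer erase
            else slots_.erase(it);
            return;
        }
    }

    void operator()() {
        if (slots_.empty()) return;
        ++depth_;
        // Slots connected during this emission land after `last` and are
        // first called on the next emission.
        const auto last = std::prev(slots_.end());
        for (auto it = slots_.begin();; ++it) {
            if (it->active) it->fn();
            if (it == last) break;
        }
        // Purge deferred removals once the outermost emission finishes.
        // (A throwing slot would skip this; the sketch ignores exceptions.)
        if (--depth_ == 0)
            slots_.remove_if([](const Entry& e) { return !e.active; });
    }

    std::size_t size() const { return slots_.size(); }

private:
    struct Entry {
        int id;
        std::function<void()> fn;
        bool active;
    };
    std::list<Entry> slots_;
    int next_id_ = 0;
    int depth_ = 0;  // > 0 while an emission is in progress
};

// A slot that disconnects itself while it is being called.
inline std::vector<int> demo_self_disconnect() {
    ListSignal sig;
    std::vector<int> log;
    int first = 0;
    first = sig.connect([&] { log.push_back(1); sig.disconnect(first); });
    sig.connect([&] { log.push_back(2); });
    sig();  // slot 1 runs and removes itself, then slot 2 runs
    sig();  // only slot 2 is left
    return log;
}
```

A std::vector version would need extra care, since push_back during dispatch can reallocate and invalidate the iterator in use; iterating by index is one way around that, at the cost of more bookkeeping.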
P.S. I hope no one sees this as a slam on boost::signal or Doug, as I
feel totally the opposite and am extremely grateful for the
Boost.Community. In fact, I'd really like someone to point out where my
test is woefully flawed, or my use of boost::signal is terribly
misguided, or some other lame-brained mistake of mine.