From: Robert Zeh (razeh_at_[hidden])
Date: 2004-12-02 09:32:21


I profiled speed_test.C with Quantify to determine where the signals
library was spending its time. On my SPARC/Solaris system with gcc
3.3.2, it spent about a third of its time in malloc, mostly as a
result of constructing the cache inside slot_call_iterator.

At home I tried replacing the shared_ptr in slot_call_iterator with an
instance variable of the result_type. Since slot_call_iterator uses
the shared_ptr to determine whether the cache is valid, I had to add a
bool indicating whether the cached value is valid. These changes made
the benchmark run about two to four times faster (on my home system, a
650 MHz Duron with gcc 3.3.2 running Debian). I expect an even bigger
improvement on my SPARC machine, because its malloc implementation
seems to be slower.
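
As a rough illustration of the change described above, here is a
minimal sketch of the two caching strategies. The class names
heap_cache and inline_cache are hypothetical, and std::shared_ptr
stands in for boost::shared_ptr; this is not the actual
slot_call_iterator code, only the idea behind the modification.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the cache inside slot_call_iterator.
// These names are illustrative and do not appear in Boost.Signals.

// Before: a null pointer means "nothing cached", and every stored
// result costs a heap allocation.
template <typename R>
class heap_cache {
public:
    bool valid() const { return static_cast<bool>(value_); }
    void store(const R& r) { value_ = std::make_shared<R>(r); }
    const R& get() const { assert(valid()); return *value_; }
    void reset() { value_.reset(); }
private:
    std::shared_ptr<R> value_;
};

// After: the result is stored inside the object itself, and a bool
// records whether it currently holds a cached value.
template <typename R>
class inline_cache {
public:
    bool valid() const { return valid_; }
    void store(const R& r) { value_ = r; valid_ = true; }
    const R& get() const { assert(valid_); return value_; }
    void reset() { valid_ = false; }
private:
    R value_{};           // requires R to be default constructible
    bool valid_ = false;  // replaces the null-pointer check
};
```

Copying a heap_cache only copies a pointer and bumps a reference
count, while copying an inline_cache copies the stored R.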

There are several drawbacks to my modifications. The code is harder
to maintain because of the added bool. The result_type used by
slot_call_iterator must now have a default constructor. And if the
result_type is expensive to copy and slot_call_iterator is copied a
lot, replacing the shared_ptr with an instance variable will actually
make things slower.

I don't know enough about the internals to weigh how important these
issues are.

I believe the correct way to do things is to create a cache interface
that encapsulates the behavior slot_call_iterator needs, and then to
choose the appropriate cache implementation at compile time using the
MPL.
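
One way the cache selection might look is sketched below. Everything
here is an assumption for illustration: the names heap_cache,
inline_cache and select_cache are hypothetical, the selection rule
(default constructible and trivially copyable) is just one plausible
policy, and std::conditional plays the role that boost::mpl::if_c
would play in Boost code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical cache that allocates the result on the heap.
template <typename R>
class heap_cache {
public:
    bool valid() const { return static_cast<bool>(value_); }
    void store(const R& r) { value_ = std::make_shared<R>(r); }
    const R& get() const { assert(valid()); return *value_; }
private:
    std::shared_ptr<R> value_;
};

// Hypothetical cache that keeps the result inline with a validity flag.
template <typename R>
class inline_cache {
public:
    bool valid() const { return valid_; }
    void store(const R& r) { value_ = r; valid_ = true; }
    const R& get() const { assert(valid_); return value_; }
private:
    R value_{};
    bool valid_ = false;
};

// Assumed policy: store inline only when R can be default constructed
// and is cheap to copy; otherwise fall back to the heap-based cache.
template <typename R>
struct select_cache {
    static constexpr bool use_inline =
        std::is_default_constructible<R>::value &&
        std::is_trivially_copyable<R>::value;
    typedef typename std::conditional<use_inline,
                                      inline_cache<R>,
                                      heap_cache<R> >::type type;
};

// The choice is made entirely at compile time.
static_assert(std::is_same<select_cache<int>::type,
                           inline_cache<int> >::value,
              "int should use the inline cache");
static_assert(std::is_same<select_cache<std::string>::type,
                           heap_cache<std::string> >::value,
              "std::string should use the heap cache");
```

Both caches expose the same valid/store/get interface, so code that
uses select_cache<R>::type would not need to know which one it got.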

I've attached both a diff for my changes to slot_call_iterator.hpp and
the modified file. I'd be interested in knowing how it changes the
performance on other platforms.

Current performance on my 650 MHz Duron

===== 1000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1       1000  0.0073  0.0002
       10        100  0.0025  0.0001
       50         20  0.0023  0.0001
      100         10  0.0021  0.0001
      250          4  0.0021  0.0001
      500          2  0.0024  0.0001
     1000          1  0.0026  0.0002

===== 10000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1      10000  0.0680  0.0022
       10       1000  0.0258  0.0007
       50        200  0.0214  0.0006
      100        100  0.0209  0.0006
      250         40  0.0209  0.0006
      500         20  0.0218  0.0008
     1000         10  0.0246  0.0008
     5000          2  0.0254  0.0025
    10000          1  0.0257  0.0027

===== 100000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1     100000  0.7370  0.0224
       10      10000  0.2573  0.0090
       50       2000  0.2284  0.0067
      100       1000  0.2691  0.0072
      250        400  0.2111  0.0069
      500        200  0.2202  0.0094
     1000        100  0.2635  0.0259
     5000         20  0.2684  0.0318
    10000         10  0.2749  0.0330
    50000          2  0.2629  0.0266
   100000          1  0.2638  0.0290

===== 1000000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1    1000000  6.9143  0.2232
       10     100000  2.5620  0.0752
       50      20000  2.2086  0.0669
      100      10000  2.1817  0.0724
      250       4000  2.1839  0.0894
      500       2000  2.1643  0.1066
     1000       1000  2.6981  0.3238
     5000        200  2.7720  0.3870
    10000        100  2.7220  0.3980
    50000         20  2.7763  0.3479
   100000         10  2.8006  0.3774
   500000          2  2.6703  0.2991

Performance on my 650 MHz Duron with an instance variable instead of a
shared_ptr in slot_call_iterator

===== 1000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1       1000  0.0014  0.0002
       10        100  0.0004  0.0001
       50         20  0.0003  0.0001
      100         10  0.0406  0.0001
      250          4  0.0003  0.0001
      500          2  0.0004  0.0001
     1000          1  0.0007  0.0002

===== 10000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1      10000  0.0145  0.0022
       10       1000  0.0039  0.0007
       50        200  0.0030  0.0006
      100        100  0.0029  0.0006
      250         40  0.0029  0.0006
      500         20  0.0033  0.0008
     1000         10  0.0066  0.0009
     5000          2  0.0073  0.0025
    10000          1  0.0076  0.0029

===== 100000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1     100000  0.1446  0.0219
       10      10000  0.0385  0.0075
       50       2000  0.0302  0.0066
      100       1000  0.0288  0.0065
      250        400  0.0296  0.0068
      500        200  0.0344  0.0083
     1000        100  0.0949  0.0257
     5000         20  0.0844  0.0318
    10000         10  0.0819  0.0329
    50000          2  0.0825  0.0294
   100000          1  0.0741  0.0282

===== 1000000 Total Calls =====
Num Slots Calls/Slot   Boost    Lite
--------- ---------- ------- -------
        1    1000000  1.5345  0.2376
       10     100000  0.6345  0.0758
       50      20000  0.6562  0.2402
      100      10000  0.2960  0.0686
      250       4000  0.3477  0.0932
      500       2000  0.5850  0.1311
     1000       1000  1.7935  0.3861
     5000        200  1.0274  0.4182
    10000        100  1.1677  0.4073
    50000         20  1.1002  0.7818
   100000         10  1.5208  0.4197
   500000          2  0.7921  0.2938

http://home.earthlink.net/~rzeh
Robert Zeh




