Boost logo

Boost :

Subject: [boost] [signals1/2] Alternative approach
From: Sebastian Karlsson (sairony_at_[hidden])
Date: 2011-09-27 05:44:23


A while back we evaluated signals 1 & 2 and came to the conclusion
that the performance is to poor for it to be viable for our use. I
have an idea which nets very close to the same invocation performance
as std::vector< boost::function< > > ( an extra shared_ptr<> per ),
with the caveat that ordering within a priority group is unspecified.
Another downside compared to the implementation used now is that it
needs extra bookkeeping per distinct priority group ( although from
the docs for signals1 it would seem it does the same ).

The gist of it is that we maintain a vector< > ( preferably we let the
storage be parameterized ) in addition to a separate map for the
priority groups. The map is keyed on the priority type, and the
bookkeeping is: vector offset, group length & shift. Offset & length
gives us the region in the vector. Shift tells us how much the group
is shifted from its start offset since it was created. So each area is
maintained as a sort of sliding window meets circular buffer. With
this information we need touch no more than the either the last or
first element ( depending on connect / disconnect ) of the groups with
a lower priority than the one the element belongs to. Our vector
contains a std::pair< function< signature >, shared_ptr< unsigned int
> >, the unsigned int is used both as a cookie for the
signals::connection to make sure that the connection is still valid,
as well as a shared offset in regards to the shift of the group. The
arithmetic is very light weight and shouldn't pose a bottleneck.

Although I haven't implemented blocking & consistent disconnect /
connect behavior during invocation yet I'm fairly certain they can be
done in the following way: For blocking we can either use an
additional bool which we branch on when invoking, or we can simply let
the connection temporarily remove itself from the vector and insert
itself once it's not blocking anymore, second solution probably
preferred. Disconnect / connect during invocation is a trickier one.
The way I'm thinking is to let the signal maintain the iterator as a
member. When we disconnect / connect we can see if it equals end (
we're not invoking ), or if it doesn't then we know both the position
as well as we're in the middle of an invocation. This way we can let
the disconnect / connect logic modify the "cursor" in order to make
sure that we get the desired behavior while not polluting our
invocation loop.

Pros:
More cache friendly
Very close to ideal case for invocation

Cons:
Using std::vector<> as a base naturally means we'll suffer from the
usual growth caveats. However, if this matters the user could use a
paged container with a decent growth strategy and come extremely close
to the ideal case anyway.
Thread safety within an invocation seems like it might be costly to
implement, especially for vector<>.
Slower connect / disconnect.

All in all I think it's a big win for what I personally believe to be
the most common usage of the signals library. I've attached a proof of
concept along with a fairly primitive benchmark. There's a fragmented
list version which can be enabled by a def, but it's naturally slow as
hell to setup so it's disabled by default. The fragmented versions
does not use exactly the same random sequence as the others, but
hopefully it's somewhat fair. The vector variants somewhat "cheats" in
that they reserve the space straight up, this is something which I
personally feel should be exposed in the interface in someway anyway
however. Here's some results:

i5-2500k @ 3.3 ghz, decent mem etc:

Using numConnections = 100000 numCalls = 100 numGroups = 200 rndSeed =
0 fragSpread = 10
Generating random sequence takes: 0.002 seconds
Connecting to unfrag list:< function<> >: 0.008 seconds
Connecting to vector< function<> >: 0.007 seconds
Connecting to unfrag signals2< function<> >: 0.051 seconds
Connecting to unfrag signals1< function<> >: 0.166 seconds
Connecting to test::signals< function<> >: 1.38 seconds
...
...
...
vector< function<> > invocation: 0.026
signals< function<> > unfragmented invocation: 1.632
signals2< function<> >unfragmented invocation: 1.404
list< function<> > unfragmented invocation: 0.03
test::signals< function<> > invocation: 0.023
signals< function<> > fragmented invocation: 9.97
signal2< function<> > fragmented invocation: 15.04

PS3 @ PPU ( Power PC ):

Using numConnections = 100000 numCalls = 100 numGroups = 200 rndSeed =
0 fragSpread = 10
Generating random sequence takes: 0.001942 seconds
Connecting to unfrag list:< function<> >: 0.060645 seconds
Connecting to vector< function<> >: 0.034558 seconds
Connecting to unfrag signals2< function<> >: 0.367024 seconds
Connecting to unfrag signals1< function<> >: 2.02785 seconds
Connecting to test::signals< function<> >: 10.0817 seconds
...
...
...
vector< function<> > invocation: 0.695916
signals< function<> > unfragmented invocation: 5.91057
signals2< function<> >unfragmented invocation: 4.90188
list< function<> > unfragmented invocation: 1.03532
test::signals< function<> > invocation: 0.952999
signals< function<> > fragmented invocation: 33.4202
signal2< function<> > fragmented invocation: 47.0125




Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk