Subject: Re: [boost] [Heaps] benchmarks
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2011-06-08 07:19:02
Tim Blechmann wrote:
>> I was unable to try the proposed d_ary_heap because I couldn't find a
>> working syntax for the arity template parameter - any ideas anyone?
> :) the scope operator uses two colons:
>> I was expecting the b_heap to be faster, but it is not; it is
>> substantially slower. What is going on there? Does it all come down
>> to size of data, cache size, etc? Ideally, an "improved"
>> implementation like this would be able to fall back to the naive
>> implementation in cases where it does not function any better. Or
>> maybe I'm using it wrongly.
> well, benchmarking this can be tricky ... first, your work is boost::funtion<>
> (iirc 3 pointers) and a float ... so 16 bytes on 32bit or 28 bytes on 64bit ...
> on 32bit systems you will have a better matching with actual physical cache
> lines ...
> second ... synthetical benchmarks are not necessarily the best way to test
> cache-aware data structures: when stressing the heap, all its content will be
> located in the cpu caches ... so you won't necessarily see the benefit of the
> cache-awareness, since the data of the non-cache-aware data structure are still
> in the CPU cache. unless you manually add some work to thrash the caches ... in
> your case, you will have a maximum of 90000 items in the queue, on 32bit
> platforms this is rougly 1.4mb ... you won't see a big effect, if this fits all
> into the caches ...
The data cache sizes on the test systems are I think:
1. AMD Athlon II X2: 64 kB L1 + 1 MB L2.
2. VIA C7-M: 32 kB L1 + 128 kB L2.
3. ARM v5 (Marvell Kirkwood): 16 kB L1 + 256 kB L2.
So this data set really shouldn't fit in the cache. Even if it did, it
would be great if the b_heap implementation performed no worse than the
Also I've looked at your own benchmark graphs and it seems that there
is a fairly constant performance difference between the different heaps
- I don't see b_heaps getting faster than the alternatives as the queue
size increases. (BTW, I find it almost impossible to distinguish the
colours in those graphs...)