Boost logo

Boost :

Subject: Re: [boost] Proposal: Monotonic Containers - Comparison with boost::pool, boost::fast_pool and TBB
From: Christian Schladetsch (christian.schladetsch_at_[hidden])
Date: 2009-06-22 21:41:21


>
> Steven> The individual tests are okay, but I don't think that it's
>
particularly meaningful to average these very different
> tests even if you include the standard deviation.

I understand that they are very different tests, and that they mix node-
with sequence-based allocation. However, that was the actual point of the
benchmarks.

std:: and tbb:: do very well against monotonic, considering that they
actually also do de-allocation; I didn't cherry-pick the tests. Even so,
both std and tbb are at least ~50% slower across the board, and that is
siginifcant.

The fact that both std:: and tbb:: are quite consistent across both MSVC and
GCC, and both with a low standard deviation compared to monotonic, speaks
somewhat about the strength and meaning of the benchmark itself.

Monotonic allocation is 2-4X faster than std::allocator on average for these
benchmarks on MSVC, and 1.4X faster than TBB on average, which will be
important to those that want fast code on windows or 360. The spread for GCC
is also good with monotonic being fractionally faster again over TBB, but
here TBB isn't hugely better than std::allocator either.

I realise that at the end of the day, allocation is not a specific
bottle-neck in general. However, for small and temporary node-based
containers it take be a large chunk of your execution time. This is
especially true for real-time systems that have fixed high-water marks (such
as embedded systems, physics simulators and deferred renderers).

Many developers spend a lot of effort trying to shave 5-10% off of their
execution speed, after they have decided on the data-structures and usage
patterns. One motivation for monotonic was to help them with this.

Another, related, usage is to provide a temporary, general and *fast*
storage area that can start on the stack and can grow to the heap (and no,
you don't pay for that flexibility if you stay within the initial limit). In
some respects, a monotonic::storage<> instance is like a first-class scope
object.

> I doubt that [the tests] are anywhere close [to] resembling a random
> sample of actual usage.

I am open to suggestions on what other tests should be added, or how the
existing tests may be changed. I do supply a mean, and std-dev for each
test, as well as a summary at the end. I realise that the summary was skewed
for boost::fast_pool because it included the pathological case of map<int,
vector<int> > for it.

I seriously considered removing this test, on the basis that maybe it was
'unfair' to boost::fast_pool_allocator. But if it was to be a test of each
of the *allocator* types, in typical usage scenarios, it had to stay in. I
also didn't remove tests that monotonic did not win.

I picked some pretty basic usage patterns that I think do, in fact, somewhat
resemble a random sample of typical usage. Constructing a list, constructing
a vector. Duplicating a list, duplicating a vector. Sorting a list, sorting
a vector. Using a set of vectors and a map of ints to vectors (which also
covers strings somewhat).

I could've made tests that I knew monotonic would win hands down, but I
avoided doing that. There were other tests that I could have done that I
couldn't do because monotonic would exhaust memory. One of these would be
something like a spinning resize on a vector. monotonic::chain<> addresses
this specific issue (not to be confused with the 'chain' that is used for
heap-based storage), but because it was to be a test using std::containers,
I had to leave that test out.

A lot of similar third-party benchmarks just test the allocation speed.
boost::fast_pool is tremendously good at this for same-sized allocations.
Monotonic is even faster, and it is much faster again for random-sized
allocations.

But rather than just test allocation speed, I specifically wanted to
benchmark both allocation times *and* the performance of the structures
created by the different allocators, including node- and sequence-based
containers, and hybrids, for a wide range of sizes (from 1 to 10,000,000).

This is why, for example, I put in list creation (which monotonic wins)
along with list sort, which monotonic does not win over
boost::fast_pool_allocator.

If I had just left it at allocation, monotonic would win. But again, I
wanted to test the performance of the created data structures - in the case
of sorting a large list, monotonic does not win. But then again, it wins at
sorting a large vector which one can argue is more common and important.
Anyway, tl;dr.

In Christ,
> Steven Watanabe
>

Regards,
Christian


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk