Subject: Re: [boost] Proposal: Monotonic Containers - Comparison with boost::pool, boost::fast_pool and TBB
From: David Bergman (David.Bergman_at_[hidden])
Date: 2009-06-19 12:36:08
Besides the use cases where allocations and deallocations are included,
I am interested in the "cache effects," i.e., how the rather focused
locus of a (small) monotonic block compares to other allocation schemes.
So, it would be useful if Christian could run some tests where the
allocations are reserved up front (quite literally for vector, but one
would have to prepopulate other containers when using the standard
allocator) and only *accesses* to those elements are measured. We all
understand that an increment of a pointer (possibly with proper
alignment arithmetic)
and a no-op deallocation is fast. No need to test or benchmark that
per se, in my opinion. Let us instead look at the cache effects!
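
A minimal sketch of such an access-only measurement follows. It assumes
C++17 and uses std::pmr::monotonic_buffer_resource purely as a stand-in
for the proposed monotonic storage; it is not Christian's library. The
names traverse_seconds, compare_list_access and AccessResult, and the
arena size, are hypothetical. Both lists are fully populated before the
clock starts, so only the traversal is timed.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <list>
#include <memory_resource>
#include <numeric>

// Time one read-only traversal. Allocation cost is excluded by
// construction: the container is already populated when this is called.
template <class Container>
double traverse_seconds(const Container& c, long long& sum_out) {
    const auto start = std::chrono::steady_clock::now();
    sum_out = std::accumulate(c.begin(), c.end(), 0LL);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical result record for the comparison below.
struct AccessResult {
    double std_seconds;
    double mono_seconds;
    long long std_sum;
    long long mono_sum;
};

AccessResult compare_list_access(std::size_t n) {
    // Standard allocator: nodes come from the general-purpose heap.
    std::list<int> heap_list;
    for (std::size_t i = 0; i < n; ++i)
        heap_list.push_back(static_cast<int>(i));

    // Stand-in for monotonic storage: nodes are carved sequentially
    // out of one arena. The initial size is an assumed, generous guess.
    std::pmr::monotonic_buffer_resource arena(n * 64 + 1024);
    std::pmr::list<int> mono_list(&arena);
    for (std::size_t i = 0; i < n; ++i)
        mono_list.push_back(static_cast<int>(i));

    assert(heap_list.size() == mono_list.size());

    AccessResult r{};
    r.std_seconds = traverse_seconds(heap_list, r.std_sum);
    r.mono_seconds = traverse_seconds(mono_list, r.mono_sum);
    return r;
}
```

A freshly filled std::list may also end up with nearly contiguous nodes,
so a fair test would probably interleave unrelated allocations while
populating the standard-allocator container, and repeat the traversal
many times to get above the timer resolution.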
On Jun 19, 2009, at 12:30 PM, Simonson, Lucanus J wrote:
> Christian Schladetsch wrote:
>>> Updated tests (more of them, better formatting of results) are
>>> here http://tinyurl.com/m83vll for GCC and here
>>> http://tinyurl.com/n9g8jv for
>>> MSVC. I also added a column for std::allocator/mono.
>> Have added comparison against TBB, as requested. These results are at
>> the same location as linked to above.
> Note tbb is pretty much the fastest established allocator. It does
> reuse memory and deallocation is not a no-op, so it is more general
> than monotonic, which must be scoped and cannot be used when volume
> of memory allocations greatly exceeds the maximum amount of memory
> actually in use at any given time. On Windows, monotonic is almost
> a wash with tbb, whereas on Linux the difference between monotonic
> and tbb is not that great and hard to evaluate because the
> measurements lack precision.
> The measurements shown are often only accurate to one significant
> digit, or don't register at all. Either get a more accurate timer
> or run larger benchmarks. Also, it is conventional to report
> speedup as a factor, rather than a percent. Instead of saying
> monotonic is 150% faster than std allocator you should report 1.5X
> as fast. This way instead of saying it is 1.9e+004% faster than
> fast_pool for thrash_pool_sort with length 510 you would say it is
> 190X faster, which just makes more sense all around. Also it makes
> more sense to say 0.95X as fast than 95% faster because it is more
> clear that values less than one are bad. People will trust
> benchmark results more if they are reported in a conventional manner.
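
The factor convention described in the quoted paragraph can be sketched
as below. The helper names speedup_factor and format_speedup are
hypothetical, and the definition used here (baseline time divided by
candidate time) is an assumption about what the tables should report.
The assertion rejects timings of zero, which is the "don't register at
all" case mentioned above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Speedup of the candidate relative to the baseline.
// A value above 1 means the candidate is faster; below 1, slower.
double speedup_factor(double baseline_seconds, double candidate_seconds) {
    // A zero timing means the timer did not register the run at all,
    // so no meaningful factor can be computed from it.
    assert(baseline_seconds > 0.0 && candidate_seconds > 0.0);
    return baseline_seconds / candidate_seconds;
}

// Render the factor in the "1.50X" style rather than as a percentage.
std::string format_speedup(double factor) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%.2fX", factor);
    return std::string(buf);
}
```

For example, a baseline of 3.0 s against a candidate of 2.0 s would be
reported as format_speedup(speedup_factor(3.0, 2.0)).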
> I'd suggest adding the google allocator to the list as well. You
> can expect it to perform about the same as tbb since that is what we
> have found in our testing.
> These benchmarks look much better than what you've shown previously,
> but I'd like you to also look for industry standard benchmarks for
> allocators, or industry standard benchmarks that you can use to
> compare allocators. It is obvious to me that monotonic should not
> be used in certain ways it wasn't designed for, like for all
> allocations by a long running program without ever freeing or
> reusing any memory it has allocated, but it is not obvious from your
> benchmarks. I'd like you to also compare peak memory in the
> benchmark results and not just runtime, since this is often what
> people care about when they are optimizing. If you only have 256GB
> of memory and you can't afford to swap, you find yourself
> implementing all kinds of things differently than if you have a
> terabyte. People need to understand the drawbacks of your
> allocator, not just the advantages, and benchmarks showing high peak
> memory and swapping when the allocator is used in a context it
> wasn't intended for should make that clear.
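
One way to get the peak-memory figure asked for above is to wrap the
allocator under test in a counting adaptor. The following is only a
sketch under stated assumptions: C++17, single-threaded benchmarks, and
hypothetical names MemoryStats and CountingAllocator. It forwards to
std::allocator and counts requested bytes, not what the operating
system actually commits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical global counters. Not thread-safe.
struct MemoryStats {
    static inline std::size_t current = 0;  // bytes currently allocated
    static inline std::size_t peak = 0;     // highest value of current
};

// Minimal allocator that forwards to std::allocator and records
// how many bytes are outstanding and what the peak was.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        T* p = std::allocator<T>{}.allocate(n);  // may throw; count after
        MemoryStats::current += n * sizeof(T);
        MemoryStats::peak = std::max(MemoryStats::peak, MemoryStats::current);
        return p;
    }

    void deallocate(T* p, std::size_t n) noexcept {
        MemoryStats::current -= n * sizeof(T);
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}

template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}
```

With a reusing allocator the current count falls as containers are
destroyed. With a monotonic scheme, where deallocation is a no-op, the
equivalent count would only fall when the whole storage is released,
which is exactly the behaviour a peak-memory column would expose.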