Boost Users :

From: Matthew Wilson (stlsoft_at_[hidden])
Date: 2003-11-21 00:18:27


> >... Thankfully, I've had encouraging responses on the
> >developer group, and the code is now posted there.
>
> Matthew,
>
> Because the discussion of how to achieve competitive boost::shared_ptr
> performance moved to the developers list, it may have left readers of the
> users list with misimpressions about performance. Perhaps you could post a
> summary of your newer results with the suggestions from the developers
> list applied.

Sure thing.

Everyone, this is the body of the post from the .devel group. I'm not
reposting the zip. You can pick that up from the other group.

Cheers

Matthew

"
> > > Would it be possible for you to share the source of the benchmarks?
> >
> > Certainly. What would you advise: would it be ok to just send to you
> > via email, or do you want me to post the whole hideous heap here?
> > There is a certain amount of swill to be had.
> >
> > 1. There's a lot of STLSoft stuff in there, for the timings, and
> > whatnot. (I don't think that's swill, of course, but still there's a
> > fair amount of stuff needed)
> > 2. There's a fair bit of very old, and *very* swillsome, Synesis
> > code. If I let you have it, you must promise not to consider it
> > representative of anything other than another lifetime, where both
> > programmers and compilers were a lot less intelligent.
>
> Well. I've been thinking about including the tests in Boost, in
> libs/smart_ptr/test, where currently shared_ptr_timing_test.cpp,
> shared_ptr_alloc_test.cpp and shared_ptr_mt_test.cpp reside.

Not clear. Do you mean my tests? That would be fine.

> When benchmarking it is common courtesy to make the source publicly
> available so that the results can be reproduced by others.

Let me make it clear, this wasn't an exercise set out for Boost bashing. I
was simply investigating whether there were significant performance
differences between internal and external reference-counting. I sought to
use boost::shared_ptr because it's popular.
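
For anyone unsure of the distinction, here is a minimal sketch of the two approaches. It is not the code from the zip. CountedThing and IntrusivePtr are hypothetical names made up for illustration, std::shared_ptr stands in for boost::shared_ptr, and the internal count is a plain integer, so this version is only safe in a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Internal reference counting: the count lives inside the object itself.
// (Hypothetical type; the count is not atomic, so single-threaded use only.)
struct CountedThing {
    std::size_t refs = 0;
    int value = 0;
};

// Minimal intrusive pointer over CountedThing (illustrative, not a real library class).
class IntrusivePtr {
public:
    explicit IntrusivePtr(CountedThing* p = nullptr) : p_(p) { if (p_) ++p_->refs; }
    IntrusivePtr(const IntrusivePtr& other) : p_(other.p_) { if (p_) ++p_->refs; }
    IntrusivePtr& operator=(IntrusivePtr other) { std::swap(p_, other.p_); return *this; }
    ~IntrusivePtr() { if (p_ && --p_->refs == 0) delete p_; }

    CountedThing* operator->() const { return p_; }
    std::size_t use_count() const { return p_ ? p_->refs : 0; }

private:
    CountedThing* p_;
};

// External reference counting: the object knows nothing about the count,
// which the smart pointer keeps in a separately managed control block.
struct Thing {
    int value = 0;
};

// Each helper makes one copy of a pointer and reports the count seen.
inline std::size_t internal_count_after_copy() {
    IntrusivePtr a(new CountedThing);
    IntrusivePtr b(a);            // bumps the count stored inside the object
    return a.use_count();
}

inline long external_count_after_copy() {
    std::shared_ptr<Thing> a(new Thing);
    std::shared_ptr<Thing> b(a);  // bumps the count stored outside the object
    return a.use_count();
}
```

Both helpers report a count of two. The practical difference is where the count is stored: the external form constructed this way needs a second allocation for it, which is why an allocator for the count can matter to the timings.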

Naturally I understand the need to share the source but, as an infrequent
user of this ng, I did not want to be presumptuous and post when that might
not be "the done thing". Furthermore, the results were initially so bad that
I did not trust them, and saw no point in making myself look a fool and
distracting others. Your advice on the quick_allocator and the new
shared_count has actually pretty much done the trick, and shared_ptr is now
pretty much on a par with the rest. (See end of post for final results)

> If you have the time, it would probably be best if you could try the Boost
> tests first, to see if the results are consistent with your own. When there
> is a difference, we should probably use one of the Boost tests as a
> threading/timing framework and just replace the scenario being tested, to
> avoid the Synesis/STLSoft dependencies.

That just comes down to time, and at the moment I have precious little of
it. I've put in a conditional compilation to omit the Synesis stuff, but I
use the WinSTL/UNIX performance_counter classes, and getting rid of them
would mean plugging in a lot of custom code to measure elapsed and
thread times. Hey, maybe Boost would be interested in having the
performance_counters? There is a UNIX one as well, and I'm sure it would be
trivial for your other-OS experts to add the requisite ones for their
platforms of choice.
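
To give an idea of what such a counter does, here is a rough portable sketch of an elapsed-time counter wrapped around a shared pointer loop. ElapsedCounter and time_shared_ptr_cycles are hypothetical names, this is not the STLSoft performance_counter interface, std::shared_ptr stands in for boost::shared_ptr, and only elapsed time is measured, not thread time.

```cpp
#include <cassert>
#include <chrono>
#include <memory>

// Hypothetical elapsed-time counter built on a monotonic clock.
// It illustrates the start/stop/read pattern only.
class ElapsedCounter {
public:
    void start() { begin_ = clock::now(); end_ = begin_; }
    void stop() { end_ = clock::now(); }

    // Whole milliseconds between the last start() and stop().
    long long milliseconds() const {
        return std::chrono::duration_cast<std::chrono::milliseconds>(end_ - begin_).count();
    }

private:
    using clock = std::chrono::steady_clock;
    clock::time_point begin_{};
    clock::time_point end_{};
};

struct Thing {
    int value = 0;
};

// Times `iterations` create/use/destroy cycles of a shared pointer.
// The checksum keeps the loop body from being optimised away entirely.
inline long long time_shared_ptr_cycles(int iterations) {
    ElapsedCounter counter;
    long long checksum = 0;

    counter.start();
    for (int i = 0; i < iterations; ++i) {
        std::shared_ptr<Thing> p(new Thing);
        p->value = i;
        checksum += p->value;
    }
    counter.stop();

    return checksum >= 0 ? counter.milliseconds() : -1;
}
```

Calling time_shared_ptr_cycles(100000) returns the elapsed milliseconds for that many cycles. Swapping the pointer type inside the loop is all it takes to compare implementations.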

I'm posting a zip with the source files, an Intel makefile (I use Borland
make, but I'm pretty sure any other make will do), and a minimum isolated
set of STLSoft files needed to build them. To build using the enhancements,
define the make symbol USE_BOOST_DIMOV_MEASURES, which causes the new
shared_count.hpp to be included, and also defines
BOOST_SP_USE_QUICK_ALLOCATOR to the compiler.

Included below are the final results, without the Synesis tests, run on my
machine, with and without USE_BOOST_DIMOV_MEASURES, for 100,000 and
1,000,000 iterations. It's clear that the "measures" address the stark
performance disparities in multi-threaded builds, and genuine multi-threaded
processes, at a small cost in single-threaded builds. No doubt that could be
handled with suitable context discrimination.
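
To put numbers on that, here is a small sketch that computes the ratios for boost::shared_ptr from the 100,000-iteration figures in the results below. The helper name speedup is made up for illustration, and the inputs are copied from the tables as-is.

```cpp
#include <cassert>

// Time without the measures divided by time with them.
// A value above 1 means the measures made that scenario faster.
inline double speedup(double without_measures, double with_measures) {
    return without_measures / with_measures;
}

// boost::shared_ptr, 100,000 iterations, figures taken from the results tables.
const double mt_discarding = speedup(410.0, 236.0);   // roughly 1.74
const double mt_saving     = speedup(713.0, 410.0);   // roughly 1.74
const double st_discarding = speedup(139.0, 191.0);   // roughly 0.73
const double st_saving     = speedup(269.0, 276.0);   // roughly 0.97

// Thread test, elapsed column, first set of results.
const double thread_test_elapsed = speedup(7441.0, 2035.0);  // roughly 3.66
```

The multi-threaded builds come out around 1.7 times faster and the threaded test several times faster, while the single-threaded discarding case is the one that gets noticeably slower.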

I hope that's enough information.

Cheers

Matthew

Without Dimov Measures
======================

shared_ptr_test: Intel C/C++ - discarding pointers - single-threaded 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 139
Ext RC (SharedPtr<Thing>): 127
Ext RC (SharedPtr<Thing> + pool): 96
Ext RC (SharedPtr<Thing> + pool2): 88

shared_ptr_test: Intel C/C++ - saving pointers - single-threaded 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 269
Ext RC (SharedPtr<Thing>): 226
Ext RC (SharedPtr<Thing> + pool): 228
Ext RC (SharedPtr<Thing> + pool2): 225

shared_ptr_test: Intel C/C++ - discarding pointers - multi-threaded 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 410
Ext RC (SharedPtr<Thing>): 245
Ext RC (SharedPtr<Thing> + pool): 216
Ext RC (SharedPtr<Thing> + pool2): 221

shared_ptr_test: Intel C/C++ - saving pointers - multi-threaded 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 713
Ext RC (SharedPtr<Thing>): 506
Ext RC (SharedPtr<Thing> + pool): 519
Ext RC (SharedPtr<Thing> + pool2): 595

shared_ptr_thread_test: Intel C/C++ elapsed thread
Ext RC (boost::shared_ptr<Thing>): 7441 2196
Ext RC (SharedPtr<Thing>): 895 785
Ext RC (SharedPtr<Thing> + pool): 1190 709
Ext RC (SharedPtr<Thing> + pool2): 2480 1017

shared_ptr_test: Intel C/C++ - discarding pointers - single-threaded 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 1427
Ext RC (SharedPtr<Thing>): 1279
Ext RC (SharedPtr<Thing> + pool): 947
Ext RC (SharedPtr<Thing> + pool2): 884

shared_ptr_test: Intel C/C++ - saving pointers - single-threaded 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 2545
Ext RC (SharedPtr<Thing>): 2292
Ext RC (SharedPtr<Thing> + pool): 2321
Ext RC (SharedPtr<Thing> + pool2): 2252

shared_ptr_test: Intel C/C++ - discarding pointers - multi-threaded 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 4159
Ext RC (SharedPtr<Thing>): 2452
Ext RC (SharedPtr<Thing> + pool): 2200
Ext RC (SharedPtr<Thing> + pool2): 2239

shared_ptr_test: Intel C/C++ - saving pointers - multi-threaded 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 7427
Ext RC (SharedPtr<Thing>): 5107
Ext RC (SharedPtr<Thing> + pool): 5243
Ext RC (SharedPtr<Thing> + pool2): 5373

shared_ptr_thread_test: Intel C/C++ elapsed thread
Ext RC (boost::shared_ptr<Thing>): 302511 21712
Ext RC (SharedPtr<Thing>): 63595 7632
Ext RC (SharedPtr<Thing> + pool): 46783 7942
Ext RC (SharedPtr<Thing> + pool2): 158324 10130

With Dimov Measures
===================

shared_ptr_test: Intel C/C++ - discarding pointers - single-threaded (+ boost quick allocator) 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 191
Ext RC (SharedPtr<Thing>): 128
Ext RC (SharedPtr<Thing> + pool): 94
Ext RC (SharedPtr<Thing> + pool2): 89

shared_ptr_test: Intel C/C++ - saving pointers - single-threaded (+ boost quick allocator) 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 276
Ext RC (SharedPtr<Thing>): 206
Ext RC (SharedPtr<Thing> + pool): 209
Ext RC (SharedPtr<Thing> + pool2): 204

shared_ptr_test: Intel C/C++ - discarding pointers - multi-threaded (+ boost quick allocator) 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 236
Ext RC (SharedPtr<Thing>): 244
Ext RC (SharedPtr<Thing> + pool): 215
Ext RC (SharedPtr<Thing> + pool2): 222

shared_ptr_test: Intel C/C++ - saving pointers - multi-threaded (+ boost quick allocator) 100000 iterations
Ext RC (boost::shared_ptr<Thing>): 410
Ext RC (SharedPtr<Thing>): 483
Ext RC (SharedPtr<Thing> + pool): 498
Ext RC (SharedPtr<Thing> + pool2): 573

shared_ptr_thread_test: Intel C/C++ elapsed thread
Ext RC (boost::shared_ptr<Thing>): 2035 876
Ext RC (SharedPtr<Thing>): 1197 800
Ext RC (SharedPtr<Thing> + pool): 1133 800
Ext RC (SharedPtr<Thing> + pool2): 4222 1066

shared_ptr_test: Intel C/C++ - discarding pointers - single-threaded (+ boost quick allocator) 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 1924
Ext RC (SharedPtr<Thing>): 1284
Ext RC (SharedPtr<Thing> + pool): 930
Ext RC (SharedPtr<Thing> + pool2): 863

shared_ptr_test: Intel C/C++ - saving pointers - single-threaded (+ boost quick allocator) 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 2817
Ext RC (SharedPtr<Thing>): 2112
Ext RC (SharedPtr<Thing> + pool): 2161
Ext RC (SharedPtr<Thing> + pool2): 2100

shared_ptr_test: Intel C/C++ - discarding pointers - multi-threaded (+ boost quick allocator) 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 2394
Ext RC (SharedPtr<Thing>): 2475
Ext RC (SharedPtr<Thing> + pool): 2180
Ext RC (SharedPtr<Thing> + pool2): 2249

shared_ptr_test: Intel C/C++ - saving pointers - multi-threaded (+ boost quick allocator) 1000000 iterations
Ext RC (boost::shared_ptr<Thing>): 4181
Ext RC (SharedPtr<Thing>): 4925
Ext RC (SharedPtr<Thing> + pool): 5087
Ext RC (SharedPtr<Thing> + pool2): 5205

shared_ptr_thread_test: Intel C/C++ elapsed thread
Ext RC (boost::shared_ptr<Thing>): 48751 8990
Ext RC (SharedPtr<Thing>): 170580 8038
Ext RC (SharedPtr<Thing> + pool): 59856 7974
Ext RC (SharedPtr<Thing> + pool2): 172761 10163

"