
Boost : 
Subject: Re: [boost] proposal  Statistically robust function timing for performance decisions.
From: Edward Grace (ej.grace_at_[hidden])
Date: 20090718 11:04:14
Dear All,
As suggested I have packaged up a roughnready version and stuffed
it in the vault.
It includes Doxygen generated documentation (HTML/PDF) and an example
(example_timer.cpp) which itself has copious comments.
Platform specifics (as far as I know) will be related to the
chronometer. You may have to wrap your own chronometer (e.g. using
something like gettimeofday() or getticks() from FFTW). I do not
recommend using std::clock() for the included example as it is simply
not precise enough  consequently you will grow bored waiting for it
to finish!
http://www.boostpro.com/vault/index.php?
action=downloadfile&filename=ejg_timer.zip&directory=Tools
On my Mac, the example can be built and run by unzipping the file and
doing the following:
$ cd ejg_timer/
$ g++ DNDEBUG O3 I. o example_timer ejg/example_timer.cpp
$ ./example_timer
Calibrating the clock, for a slow timer this may take a while!
Point estimate of clock overhead (t_c): 0.0986196
Potential jitter on t_c (sigma_c) : 0.633371
Increasing precision until I can discriminate between function a and b.
The nominal confidence interval (minimum, [median], maximum)
represents the region: 1alpha=0.95
Timing at a nominal precision of 50%: We cannot distinguish between A
and B: (1.62543 [0.339245] 1.78913)%
Timing at a nominal precision of 25%: We cannot distinguish between A
and B: (0 [0] 2.38656)%
Timing at a nominal precision of 12.5%: We cannot distinguish between
A and B: (0 [0] 0.69492)%
Timing at a nominal precision of 6.25%: We cannot distinguish between
A and B: (0 [1.84168] 3.71728)%
Timing at a nominal precision of 3.125%: We cannot distinguish
between A and B: (0.173508 [0.757778] 0.924288)%
Timing at a nominal precision of 1.5625%: A is faster than B by:
(0.563103 [0.690461] 0.940433)%
As you will note, there's a fair bit of extra cruft that has come
along for the ride, in particular I think it is worth being aware of:
striding_iterator  Perhaps boost could do with this as a separate
thing  it's pretty handy.
robust_linear_fit  A least absolute deviation robust linear fit,
used to calibrate the clock overhead. This is preferable to least
squares as it is insensitive to outliers.
statistics::ks_test  This implements a KolmogorovSmirnov two
sample test and is critical to answering the question 'have we taken
enough samples'.
wilcoxon_ci  Carry out a Wilcoxon signedrank test. Used to
determine confidence intervals on paired samples for the relative
timing.
I will take some time to carefully review my thought processes to
check that what I did is what I intended to do. There are some subtle
points which stretch my statistics understanding regarding the
repeated application of the ks_test  I suspect this is best
approached with some Baysian prior technique  that said what I have
seems to work well! Perhaps more by luck than judgement!
Comments and questions are welcome.
ed

"No more boom and bust."  Dr. J. G. Brown, 1997
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk