Subject: Re: [boost] [xpressive] Performance Tuning?
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2009-07-30 13:26:33

Thorsten Ottosen wrote:
> Edward Grace skrev:
>>>> I'm clearly going to have to ponder this in some depth. While I've
>>>> thought about trying to get the maths (statistics) right, I've not
>>>> really given the nitty gritty of the machine's operation a great
>>>> deal of thought.
>>> Well, there you have it. I'd love to have your expertise on statistics
>> I claim none! I've been trying to learn as I go - perhaps a little
>> knowledge is a dangerous thing!
>>> plus Matthias Troyer, et al. test harness be combined in an easy to
>>> use benchmarking library.
>> Be careful what you wish for, it might come true... ;-)
>>> Benchmarking is such a black art :-) !
>> I totally agree! Like presumably many others, this appeared to me to
>> be a trivial problem - at first sight. In fact it's anything but
>> straightforward. I bet there are half a dozen or so PhDs sitting in
>> this particular dark recess of computing.
> Well, I normally use a tool like vtune running the code on real data. Do
> you think such a tool is unreliable?
VTune is for tuning application performance. VTune is too big a hammer for measuring benchmark runtime. Benchmarks are about wall clock time of a piece of code as a metric for performance of something else. How you measure that time and how you report those measurements is a problem that is prone to error.

Personally, I perform several runs of the same benchmark and report the minimum time. This excludes the outliers due to the OS and such. If a car company reported the average (mean or median) 0 to 60 mph time for a given car when their test driver was trying to see how fast it was, everyone would think they were crazy. Why should the fact that he pushed the gas too hard on some of the trials and spun out, and not hard enough on others, count against the car?

I also typically don't have the luxury of benchmarking on an unloaded machine, so I have to make sure to do things fairly. For instance, instead of running many trials of A, then many trials of B, and taking the minimum for each, I run many trials of A then B back to back and take the minimum for each. That way A and B on average see the same environment over the course of the experiment. If I run A and B in the same process, I run A first and then B, and then, as a separate run, B first and then A, to ensure that the order doesn't impact their performance.
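
The procedure described above (interleave A and B, keep the minimum for each, then repeat with the order swapped) could be sketched roughly as follows. This is only an illustration, not code from the post: the names time_once, interleaved_min and MinTimes are hypothetical, and the use of std::chrono::steady_clock as the wall-clock source is an assumption.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Time a single call of f, in seconds, with a monotonic clock.
// (Assumption: steady_clock is an acceptable stand-in for wall-clock time.)
template <class F>
double time_once(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical result type: the best (minimum) time seen for each candidate.
struct MinTimes {
    double a;
    double b;
};

// Run A and B back to back for `trials` rounds and keep the minimum of each,
// so both see roughly the same machine load over the experiment.
// With b_first = true, B runs before A within every round.
template <class FA, class FB>
MinTimes interleaved_min(FA&& a, FB&& b, int trials, bool b_first = false) {
    MinTimes best{std::numeric_limits<double>::infinity(),
                  std::numeric_limits<double>::infinity()};
    for (int i = 0; i < trials; ++i) {
        if (b_first) {
            best.b = std::min(best.b, time_once(b));
            best.a = std::min(best.a, time_once(a));
        } else {
            best.a = std::min(best.a, time_once(a));
            best.b = std::min(best.b, time_once(b));
        }
    }
    return best;
}
```

One would call interleaved_min(a, b, n) and then, in a separate run, interleaved_min(a, b, n, true), and compare the two results to check whether the A/B order affects the minimum times.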

Usually benchmarking implies comparison, so the key concern is that the results are fair.

