Subject: Re: [boost] [xpressive] Performance Tuning?
From: Matthias Troyer (troyer_at_[hidden])
Date: 2009-08-03 17:45:47
On 28 Jul 2009, at 11:46, Edward Grace wrote:
>
> On 28 Jul 2009, at 18:17, Joel de Guzman wrote:
>
>> Edward Grace wrote:
>>
>>>> That is a *lot* more reasonable, although Spirit is still most
>>>> definitely faster than the built-in functions. :)
>>>
>>> That's good though - one up for Boost!
>>
>> My latest benchmarks for integers and floating points reveal
>> a 3x speedup over atoi/strtol and the related C functions. You
>> mentioned a need to parse small numbers very quickly? Spirit
>> does. The tests I have take that into consideration too. If you
>> guys want to take a peek, it's in the Boost trunk in libs/benchmarks.
>
> Sure. Can you give me an exact url? The last time I went on a code
> hunt in SVN I found the wrong thing.
>
>> Some numbers:
>>
>> ///////////////////////////////////////////////////////////////////////////
>> atoi_test: 0.9265067422 [s] {checksum: d5b76d60}
>> strtol_test: 1.0766213977 [s] {checksum: d5b76d60}
>> spirit_int_test: 0.3097019879 [s] {checksum: d5b76d60}
>>
>> ///////////////////////////////////////////////////////////////////////////
>> atof_test: 7.3012049917 [s] {checksum: 3b7d82b0}
>> strtod_test: 8.0042894122 [s] {checksum: 3b7d82b0}
>> spirit_double_test: 2.6729373333 [s] {checksum: 3b7d82b0}
>>
>> This time, I am using the benchmarking harness by David Abrahams,
>> Matthias Troyer, Michael Gauckler.
>
> This?
>
> http://tinyurl.com/kk858o
>
> There's some interesting trickery in there by the looks of things
> for eliminating the optimiser nastiness - that's not something I've
> thought about much; I'll take a look.
>
> In the comments,
>
> // operation to at least update the L1 cache. *** Note: This
> // concern is specific to the particular application at which
> // we're targeting the test. ***
>
> that seems quite important but a little opaque out of context.
We looped over multiple accumulators in this test because that was
the application scenario. We wanted to measure the abstraction penalty
of using the Boost.Parameter library, but in a scenario where the
functions called at least access the L1 cache, so that the result is
not skewed by unrealistic, oversimplified code that the compiler can
optimize more aggressively than real code.
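A minimal sketch of that kind of test, assuming C++11 and a hypothetical
Accumulator type that stands in for the real accumulators (it is not the
code in the harness and does not use Boost.Parameter). Each sample is fed
to several live objects, so every call touches memory, and the total is
returned as a checksum so the optimizer cannot discard the work:

```cpp
#include <chrono>
#include <vector>

// Hypothetical stand-in for one accumulator: a running sum held in memory.
struct Accumulator {
    double sum = 0.0;
    void operator()(double x) { sum += x; }
};

// Feed every sample to every accumulator. Looping over several objects
// means each call reads and writes memory instead of one register that
// the optimizer could fold into a constant.
double run_accumulators(std::vector<Accumulator>& accs,
                        const std::vector<double>& samples) {
    for (double x : samples)
        for (Accumulator& a : accs)
            a(x);
    double total = 0.0;
    for (const Accumulator& a : accs)
        total += a.sum;
    return total;  // checksum: the caller must consume this value
}

// Time a single run in seconds; the checksum is written to an out
// parameter so the timed work has an observable result.
double time_run(std::vector<Accumulator>& accs,
                const std::vector<double>& samples,
                double& checksum) {
    const auto t0 = std::chrono::steady_clock::now();
    checksum = run_accumulators(accs, samples);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

Printing the checksum next to the time, as the numbers quoted earlier in
the thread do, also confirms that competing implementations computed the
same result.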
> One thing I take exception to is the (effective) use of the mean as
> a measurement of central tendency - perhaps their trickery has
> eliminated the heavy tail. I'll have to take a look and see how it
> compares to my approach.
Yes, we wanted to eliminate irrelevant heavy tails mainly by running
the test multiple times. Why should the cost of e.g. swapping Word and
Excel out of main memory to get space to run your test be measured as
part of your test? This is a cost that *any* program might have to pay
at times, no matter which algorithm you are comparing. I do not want
such extreme events, which are outside of my program's control, to mess
up the comparisons.
Another comment about heavy tails: if they are there then I want to
analyze them and understand them. If they are absent then the mean is
fine. We typically run the benchmark multiple times to eliminate heavy
tails caused by swapping, etc.
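One way to check whether a heavy tail is present is to compare the mean
of the repeated timings with their median and minimum. The following is
only a sketch; TimingSummary and summarize are hypothetical names, not
part of the benchmarking harness:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical summary of the timings from repeated benchmark runs.
struct TimingSummary {
    double min;
    double median;
    double mean;
};

// Takes the timings by value so they can be sorted locally.
TimingSummary summarize(std::vector<double> t) {
    if (t.empty())
        throw std::invalid_argument("no timings");
    std::sort(t.begin(), t.end());
    const std::size_t n = t.size();
    // Median: middle element, or the average of the two middle elements.
    const double median =
        (n % 2 == 1) ? t[n / 2] : 0.5 * (t[n / 2 - 1] + t[n / 2]);
    const double mean =
        std::accumulate(t.begin(), t.end(), 0.0) / static_cast<double>(n);
    return TimingSummary{t.front(), median, mean};
}
```

For the timings 1, 1, 1, 1, 6 the minimum and median are both 1 while the
mean is 2: a single outlier (a swap, say) moves the mean but not the
median. When mean and median agree closely, the mean is a fine summary.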
> How do your relative timings compare if you repeat them while (say)
> watching a DVD? [*]
Again, if you care about the performance of codes while you watch a DVD,
then that is just the benchmark you should run. My codes typically run
while I am not watching a DVD, and hence I do not repeat them while
watching a DVD.
> [*] This may seem a perverse question. I'm interested in robust
> performance measurement, in other words accurately working out which
> function is fastest while the machine is under choppy loading -- not
> a sanitised testing environment - so the fastest function can be
> selected by the code itself.
Again, my codes, and most performance-sensitive codes that I know of
run in a rather "sanitized" environment without users watching DVDs or
playing games on the machine while they run.
If you want tests under choppy load then you have to provide a
"sanitized" and reproducible choppy load environment.
Matthias
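A reproducible choppy load of the kind described above could be sketched
as follows, assuming C++11. LoadStep, make_schedule and run_load are
hypothetical names: the schedule of busy and idle intervals comes from a
seeded generator, so the same seed always replays the same load pattern.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// One step of the artificial load: spin for `busy`, then sleep for `idle`.
struct LoadStep {
    std::chrono::milliseconds busy;
    std::chrono::milliseconds idle;
};

// Build a schedule from a seed. The raw std::mt19937 output is used
// (no distribution object), so a given seed yields the same sequence
// on every run. Durations lie in [0, max_ms]; max_ms must be below
// the maximum value of unsigned.
std::vector<LoadStep> make_schedule(std::uint32_t seed, std::size_t steps,
                                    unsigned max_ms = 50) {
    std::mt19937 rng(seed);
    std::vector<LoadStep> schedule;
    schedule.reserve(steps);
    for (std::size_t i = 0; i < steps; ++i) {
        const std::chrono::milliseconds busy(rng() % (max_ms + 1u));
        const std::chrono::milliseconds idle(rng() % (max_ms + 1u));
        schedule.push_back(LoadStep{busy, idle});
    }
    return schedule;
}

// Replay the schedule in the calling thread. The volatile counter keeps
// the busy loop from being optimized away.
std::uint64_t run_load(const std::vector<LoadStep>& schedule) {
    volatile std::uint64_t sink = 0;
    for (const LoadStep& s : schedule) {
        const auto until = std::chrono::steady_clock::now() + s.busy;
        while (std::chrono::steady_clock::now() < until)
            sink = sink + 1;
        std::this_thread::sleep_for(s.idle);
    }
    return sink;
}
```

To disturb a benchmark, run_load would be started in a separate
std::thread (or a separate process) before timing begins. The intervals
are reproducible, but how the operating system schedules the two
activities against each other is not, so this controls the load pattern
only.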