From: Matthew Hurd (matt_at_[hidden])
Date: 2004-02-12 04:56:31
> On Behalf Of Matthew Hurd
> Subject: RE: [boost] [function] timing trivia
>
>
>
> > On Behalf Of Douglas Paul Gregor
> > Subject: Re: [boost] [function] timing trivia
> >
> > On Thu, 12 Feb 2004, Matthew Hurd wrote:
> > > Just completed some timings of the cost of a boost function call using a
> > > boost::function<void (void)> and a raw function pointer versus inlining
> > > the function's code directly into the source.
> > >
> > > For my circumstances (your mileage may vary):
> > > Calling via raw function pointer = 314 microseconds +/- 1
> > > Calling via boost function variable = 328 microseconds +/- 1
> > >
> > > Only 14 microseconds of difference. Nice work, I was expecting much more.
> > > Note: Optimised VC7.1 output, winXP 1a, Athlon 2800+ (2.09GHz)
> >
> > Looks like VC 7.1 is doing a good job optimizing this. I get asked about
> > the performance of function<> every once in a while; mind if I stick this
> > data in the FAQ?
> >
> > Doug
>
> Doug,
>
> These results are invalid.
>
> I think I am embarrassing myself here... my benchmarking code is reporting
> the function call overhead in proportion to the size of the loop internal to
> a function :-( I'm doing something wrong here, it might be the optimizer
> eliding some code due to it detecting an unused variable.
>
> I'll investigate and report some results I'm more confident in.
>
> Regards,
>
> Matt Hurd.

I've been staring at measurements and have double-checked many things, but I'm
at a loss to explain what I see.

The good news: for optimised code, I'm confident that calling a static method
or a plain function through boost::function costs within 60 nanoseconds of
embedding the code itself. That holds for code that is a tight loop,
irrespective of the size of the loop, for my circumstances. Importantly,
boost::function sometimes measures faster, to the resolution of my
high-resolution timer.

That is, for the optimised case, I measure no consistent, discernible
abstraction penalty at all.

Lemma: boost::function is beautiful
What am I measuring? This method, or the equivalent function...

    static double not_empty()
    {
        static double sum;
        static double i;
        sum = 0.0;
        for (i = 0.0; i < MAX_FN_LOOP; ++i)
        {
            sum += i * i;
        }
        return sum;
    }

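For anyone wanting to reproduce the comparison, here is a minimal sketch of the three call forms being compared. It is not my benchmark code: std::function stands in for boost::function so the snippet needs no Boost headers, the value of MAX_FN_LOOP is a made-up 1000, and the names call_via_pointer, call_via_wrapper and call_inlined are hypothetical.

```cpp
#include <cassert>
#include <functional>

// Assumed loop bound; the real value of MAX_FN_LOOP varied per experiment.
constexpr double MAX_FN_LOOP = 1000.0;

// The measured function, as shown above.
static double not_empty()
{
    static double sum;
    static double i;
    sum = 0.0;
    for (i = 0.0; i < MAX_FN_LOOP; ++i)
    {
        sum += i * i;
    }
    return sum;
}

// Call through a raw function pointer.
double call_via_pointer()
{
    double (*fp)() = &not_empty;
    return fp();
}

// Call through a type-erased wrapper.
// std::function is used here as a stand-in for boost::function.
double call_via_wrapper()
{
    std::function<double()> fn = &not_empty;
    assert(static_cast<bool>(fn)); // the wrapper must hold a target
    return fn();
}

// The same loop written directly at the call site.
double call_inlined()
{
    double sum = 0.0;
    for (double i = 0.0; i < MAX_FN_LOOP; ++i)
    {
        sum += i * i;
    }
    return sum;
}
```

All three forms must return the same sum; only the call mechanism differs, which is the thing being timed.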
I see some strange behaviour for unoptimised debug code (median of 1000
measurements):

Iterations in the internal    Cost in microseconds of
loop of the function          boost::function<double (void)>
--------------------------------------------------------------
         1                         0.182
        10                         0.196
       100                         0.183
     1,000                         0.059
    10,000                        -1.176
   100,000                       -14.767
 1,000,000                       +67.110
10,000,000                    -1,131.9

Can't explain it. boost::function can't really be faster for the 10 million
loop by more than a millisecond. I see none of this wacky behaviour with
release optimised code.
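
To make the sign convention concrete, here is a minimal sketch of how such a cost figure can be derived. It assumes each entry is the median time of the calls through the function object minus the median time of the inlined loop; the helper names median and call_overhead are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of a set of timings. Takes a copy because nth_element reorders.
double median(std::vector<double> v)
{
    assert(!v.empty());
    const std::size_t mid = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    const double upper = v[mid];
    if (v.size() % 2 == 1)
    {
        return upper;
    }
    // Even count: average the two middle values. After nth_element, the
    // lower middle is the largest element in the first half.
    const double lower = *std::max_element(v.begin(), v.begin() + mid);
    return (lower + upper) / 2.0;
}

// Assumed definition of the reported cost: median via the function object
// minus median of the inlined loop. Negative means the call measured faster.
double call_overhead(const std::vector<double>& timing_fn,
                     const std::vector<double>& timing_inline)
{
    return median(timing_fn) - median(timing_inline);
}
```

A negative result only says the median of the wrapped calls came out lower than the median of the inlined loop; it says nothing about why.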

The code is pretty straightforward at the core.

    // Trial 1: time each call made through the function object.
    double answer1 = 0.0;
    for (size_t j = 0; j < num_trials; ++j)
    {
        trial.restart();
        answer1 += fn();
        now = trial.elapsed();
        timing1[j] = now;
    }

    // Trial 2: time the same loop embedded directly in the source.
    double answer2 = 0.0;
    for (size_t j = 0; j < num_trials; ++j)
    {
        trial.restart();
        static double sum;
        static double i;
        sum = 0;
        for (i = 0; i < MAX_FN_LOOP; ++i)
        {
            sum += i * i;
        }
        answer2 += sum;
        now = trial.elapsed();
        timing2[j] = now;
    }

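The timer class is not shown above, so here is a self-contained sketch of the first loop using std::chrono::steady_clock instead. This is an assumption for illustration, not the high-resolution timer I actually used; TrialResult and time_calls are hypothetical names, and std::function again stands in for boost::function.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result type: one timing per trial plus the accumulated answer.
struct TrialResult
{
    std::vector<double> micros; // elapsed microseconds for each trial
    double answer;              // sum of return values, so the work is used
};

// Time num_trials individual calls of fn, one measurement per call.
TrialResult time_calls(const std::function<double()>& fn, std::size_t num_trials)
{
    assert(static_cast<bool>(fn));
    using clock = std::chrono::steady_clock;

    TrialResult r;
    r.micros.reserve(num_trials);
    r.answer = 0.0;
    for (std::size_t j = 0; j < num_trials; ++j)
    {
        const clock::time_point start = clock::now();
        r.answer += fn();
        const clock::time_point stop = clock::now();
        r.micros.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }
    return r;
}
```

Accumulating the return value and handing it back to the caller gives the optimiser less excuse to discard the call as unused, although that alone is no guarantee.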
Doug, I hope this says something, but the strangeness leaves me uneasy.
Regards,
Matt Hurd.