Subject: Re: [proto] proto performance
From: Karsten Ahnert (karsten.ahnert_at_[hidden])
Date: 2011-02-25 02:52:06


> MacBook Pro, 10.6.6, Core 2 Duo
>                       ProtoContext  ProtoTransform  ProtoLambda  Loop
> GCC 4.2.1 (Apple)   : 5.3565438     5.3721942       126.38458    1.3657978
> GCC 4.4.5           : 1.8878364     1.8845548       70.056237    0.942303
> GCC 4.5.2           : 1.8840608     1.889619        1.2806688    1.0589558
> GCC 4.6.0 (2/5/11)  : 1.8854768     1.8834438       1.278347     1.2345208
> CLANG 2.9 (125472)  : 5.455976      5.4627628       3.825104     1.2330524
>
> Now, removing the ((noinline)), gives (in the same order)
>
> GCC 4.2.1 (Apple)   : 4.1448478     5.3795842       126.53211    1.3215378
> GCC 4.4.5           : 1.2505956     1.2500816       69.409665    0.7198288
> GCC 4.5.2           : 0.596143      0.7213138       0.71969283   0.7211534
> GCC 4.6.0 (2/5/11)  : 1.2942638     1.4324828       0.646147     0.6632324
> CLANG 2.9 (125472)  : 1.2975226     1.2966478       1.3849834    1.2452362

Interesting results. I ran a similar test on plain loops (for, while,
with and without pointers) and saw the same pattern: the timings depend
heavily on the compiler.

I think the ranking of the above numbers will change drastically if the
expression is small, like x3 = x1 + 2.0 * x2.

> I'm not sure how meaningful this second set of numbers is. If the evaluation functions are inlined, the compiler
> can realize that evaluating them num_of_steps times is unnecessary since the data isn't changing between
> iterations. It then (I believe) optimizes out certain parts of the loop in certain cases.

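Maybe it would be better to evaluate something with the increment-assign
operator, x3 += x1 + 2.0 * x2. Then x3 changes in every iteration, so
each step depends on the previous one and the compiler cannot simply
skip the repeated evaluations.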
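
A minimal sketch of such a test, assuming plain std::vector<double>
operands instead of the Proto expressions from the benchmark above. The
names accumulate_steps and time_accumulate are made up for illustration
and do not come from the original test code. Because x3 is read and
written in every step, an optimizer that respects strict floating-point
semantics should not be able to collapse the outer loop into a single
evaluation, though this depends on the compiler and its flags.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original benchmark):
// applies x3 += x1 + 2.0 * x2 element-wise, num_of_steps times.
void accumulate_steps(std::vector<double>& x3,
                      const std::vector<double>& x1,
                      const std::vector<double>& x2,
                      std::size_t num_of_steps)
{
    assert(x1.size() == x3.size() && x2.size() == x3.size());
    for (std::size_t step = 0; step < num_of_steps; ++step)
        for (std::size_t i = 0; i < x3.size(); ++i)
            x3[i] += x1[i] + 2.0 * x2[i];  // uses x3[i] from the previous step
}

// Hypothetical timing wrapper: returns the elapsed wall-clock time in seconds.
double time_accumulate(std::vector<double>& x3,
                       const std::vector<double>& x1,
                       const std::vector<double>& x2,
                       std::size_t num_of_steps)
{
    const auto start = std::chrono::steady_clock::now();
    accumulate_steps(x3, x1, x2, num_of_steps);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

The caller should print or otherwise use x3 after the timed run, so that
the result stays observable and the whole computation is not removed as
dead code.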

