From: C. Green (postmast.root.admi.gov_at_[hidden])
Date: 1999-09-04 15:57:53


At 10:12 AM -0700 1999.09.03, Reid Sweatman wrote:
>> My notes have his graphs indicating a 2X performance improvement
>> when using the Blitz++ to C translator, compared to using
>> vanilla templates.
>
>To me, that sounds like the current crop of C++ compilers just don't have
>adequate optimizers (well, we know they all fall down in one area or
>another). Since templates have become sort of standard only lately, I guess
>I'm not too surprised that most C++ compilers don't optimize them well.

What exactly is there to optimize with regard to templates? All I can think
of is the space optimization of merging duplicate code together (with
appropriate thunks to preserve the unique addresses of the unmerged
functions). Maybe you're referring to some brain-dead compilers that
disable certain features (like inlining) when templates are used. I sure
hope that current compilers don't have these issues anymore.
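
A minimal sketch of the kind of duplicate code that merging would target. The names (count_nonnull, count_ints, count_doubles) are hypothetical and made up for illustration. The template body never looks at what T points to, so on platforms where all object pointers have the same representation the two instantiations are likely to compile to identical machine code. Whether a given compiler or linker actually folds them is an assumption, not something this code can show.

```cpp
#include <cstddef>

// Counts non-null entries in an array of pointers. The body never
// dereferences p[i], so it does not depend on what T actually is.
template <class T>
std::size_t count_nonnull(T* const* p, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (p[i] != 0) ++count;
    return count;
}

// Two instantiations, as hypothetical client code might request them.
// Each one is a separate function with its own address, even if the
// generated instructions happen to be byte-for-byte the same.
std::size_t count_ints(int* const* p, std::size_t n) {
    return count_nonnull<int>(p, n);
}

std::size_t count_doubles(double* const* p, std::size_t n) {
    return count_nonnull<double>(p, n);
}
```

If the two bodies are merged, the thunks mentioned above are what would keep the addresses of count_nonnull<int> and count_nonnull<double> distinct.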

>But
>there shouldn't be any reason they *couldn't*, if the compiler writers would
>focus their attention on the problem. The mere fact that the guys at
>Argonne could write such a translator and get a 2X performance boost is
>pretty well proof of that;

This is understandable. I think compilers still suck at keeping an object's
bits in a register (even when it has just one member) and at optimizing out
the extra creation and destruction of objects. This isn't particular to
templates, though, so you'd run into the same trouble with hand-written
classes that presented the same abstraction. I think there is a test (maybe
it uses the STL too) called the "abstraction penalty" (ouch). It's good to
know that this penalty can be zero, and I expect compilers to get close to
that in a few years. Many optimizations that aren't that difficult to
implement would significantly reduce the amount of generated code.
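
A rough sketch of what such an abstraction-penalty comparison looks like. It assumes a hypothetical one-member class Wrapped, and it is not the actual test referred to above. Both functions compute the same sum; the second goes through the class and creates a temporary on every iteration. With a zero penalty the two loops would compile to the same code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical one-member wrapper: same bits as a double, but a class type.
struct Wrapped {
    double value;
    explicit Wrapped(double v) : value(v) {}
};

// Addition through the abstraction; returns a new temporary object.
inline Wrapped operator+(Wrapped a, Wrapped b) {
    return Wrapped(a.value + b.value);
}

// Baseline: plain built-in arithmetic on doubles.
double sum_raw(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}

// Same loop written against the wrapper. Each iteration constructs a
// Wrapped from v[i] and another one for the result of operator+.
double sum_wrapped(const std::vector<double>& v) {
    Wrapped total(0.0);
    for (std::size_t i = 0; i < v.size(); ++i)
        total = total + Wrapped(v[i]);
    return total.value;
}
```

Timing sum_raw against sum_wrapped on a large vector gives the penalty for one particular compiler and set of flags. The ratio depends on whether the compiler keeps Wrapped in a register and removes the temporaries.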

>I don't really think that the standard C++ speed
>boojums like vtable overhead could be accounting for that kind of speed
>loss.

Especially since Blitz++ likely doesn't use virtual functions :-)

If it did, this type of speed difference wouldn't surprise me at all, since
the abstraction adds a significant number of function calls and the actual
work being done in these tests is likely very small (simple built-in math
operations). On compilers that don't apply loop-invariant optimizations to
virtual functions called within a loop, the cost of the virtual calls can be
fairly significant if the work being done in the body of the function is
small.
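
To illustrate the point about small function bodies, here is a sketch with made-up names (Op, AddOne, AddOneStatic, run_virtual, run_static). Nothing here is taken from Blitz++. The same trivial operation is called in a loop, once through a virtual function and once through a template parameter whose type is known at compile time.

```cpp
#include <cstddef>

// Dynamic dispatch: the loop below calls apply() through the vtable.
struct Op {
    virtual ~Op() {}
    virtual double apply(double x) const = 0;
};

struct AddOne : Op {
    double apply(double x) const { return x + 1.0; }
};

// The body of apply() is a single addition, so the cost of an
// un-optimized virtual call can dominate each iteration.
double run_virtual(const Op& op, std::size_t n) {
    double x = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        x = op.apply(x);
    return x;
}

// Static dispatch: same operation, but the type is a template
// parameter, so the call target is known and can be inlined.
struct AddOneStatic {
    double apply(double x) const { return x + 1.0; }
};

template <class F>
double run_static(const F& f, std::size_t n) {
    double x = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        x = f.apply(x);
    return x;
}
```

Both versions return the same value. Any timing difference between run_virtual and run_static comes from the call mechanism, and how large it is depends on the compiler.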

[snip]

