Subject: Re: [boost] performance of a linear algebra/matrix library
From: DE (satan66613_at_[hidden])
Date: 2010-05-11 12:15:44
for those who still care
i investigated an "issue" which i claimed was due to abstraction
penalty
it seemed to me the error was in the source code...
indeed it was in the code... of my DNA
it turned out the performance difference of 33% was due to loop
unrolling in the C code while the loop in the C++ code was not
unrolled
in other aspects the two loops were identical (abstraction in the C++
code completely optimized away)
so it's not that the C++ code ran slow but that the optimized C code
ran faster
i unrolled the loop in the C++ code manually and the two started to
run in the same time (it seems now it was a cpu pipeline issue)
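
for illustration, manual unrolling of such a loop might look like the
sketch below. this is not the code that was benchmarked: the kernel
(y += a*x), the unroll factor of 4 and the names axpy_plain and
axpy_unrolled are assumptions made up for the example

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// plain loop: y[i] += a * x[i] (hypothetical kernel, not the benchmarked one)
void axpy_plain(double a, const std::vector<double>& x, std::vector<double>& y)
{
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] += a * x[i];
}

// same loop unrolled by hand, 4 elements per iteration
void axpy_unrolled(double a, const std::vector<double>& x, std::vector<double>& y)
{
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const std::size_t n4 = n - n % 4;   // largest multiple of 4 not exceeding n
    std::size_t i = 0;
    for (; i < n4; i += 4)
    {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
    for (; i < n; ++i)                  // remainder: 0 to 3 leftover elements
        y[i] += a * x[i];
}
```

both functions compute the same result; the unrolled one just does
fewer loop-counter updates and branches per element, which gives the
cpu pipeline more independent work per iteration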
furthermore i looked at icc11 generated assembly code and was shocked
icc not only optimized the abstraction away but also unrolled both
loops (the C and C++ ones) AND vectorized them
that is both plain C and C++ pieces of code were transformed into
instruction sequences like
movsd
movhpd
mulpd
movapd
//etc.
as a result icc generated code ran 15% faster than both msvc80 and
msvc10 versions
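
for comparison, a hand-written sse2 version of a simple loop could
look roughly like the sketch below. it is only an assumption of what a
"simd-enabled implementation" might be: the kernel (y = a*x), the name
scale_simd and the macro SKETCH_HAVE_SSE2 are hypothetical. the
intrinsics themselves (_mm_set1_pd, _mm_loadu_pd, _mm_mul_pd,
_mm_storeu_pd) are the standard ones from emmintrin.h, and _mm_mul_pd
is the intrinsic form of mulpd. on targets without sse2 the sketch
falls back to the scalar loop

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// enable the intrinsic path only where sse2 is known to be available
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// y[i] = a * x[i], two doubles per iteration where sse2 is available
void scale_simd(double a, const std::vector<double>& x, std::vector<double>& y)
{
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE2
    const __m128d va = _mm_set1_pd(a);          // broadcast a into both lanes
    for (; i + 2 <= n; i += 2)
    {
        __m128d vx = _mm_loadu_pd(&x[i]);       // unaligned load of 2 doubles
        __m128d vr = _mm_mul_pd(va, vx);        // packed multiply (mulpd)
        _mm_storeu_pd(&y[i], vr);               // unaligned store of 2 doubles
    }
#endif
    for (; i < n; ++i)                          // scalar tail, or whole loop without sse2
        y[i] = a * x[i];
}
```

this is the kind of code a vectorizing compiler can produce on its own
from the plain scalar loop, which is what makes the question below
worth asking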
here a question arises:
since a compiler is able to generate very fast code involving simd
instructions, is one supposed to provide a simd-enabled implementation
of a generic library?
personally i now think that it is not worth the effort
-- Pavel