From: Vadim Zborovskii (vadim_z_at_[hidden])
Date: 2006-10-10 09:05:37
To multiply one row of a matrix stored in compressed-row-storage (CRS)
format by a vector, we first need to find the offset of that row in the
indices array. Then, for each index from this offset up to the end of the
row, we multiply the matrix element at that index by the corresponding
vector element and sum up the products.
So in the inner loop of the matrix-by-vector product, the index offsets
for the given row are loop invariants and thus should not be recomputed
on every iteration.
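To make the point concrete, here is a minimal C++ sketch of a CRS
matrix-by-vector product in which the row bounds are read once per row,
outside the inner loop. It is only an illustration: the CrsMatrix struct,
its field names and the crs_multiply function are hypothetical and are
not taken from uBlas or from Saad's book. The names k1 and k2 are chosen
to match the K1 and K2 mentioned later in this message, and the sketch
assumes 0-based indices with k2 pointing one past the end of the row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CRS container for illustration only (not a uBlas type).
struct CrsMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i occupies [row_ptr[i], row_ptr[i + 1])
    std::vector<std::size_t> col_idx;  // column index of each stored element
    std::vector<double> values;        // value of each stored element
};

// Computes y = A * x. The row bounds k1 and k2 are loaded once per row,
// so the inner loop touches only values, col_idx and x.
std::vector<double> crs_multiply(const CrsMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        const std::size_t k1 = a.row_ptr[i];      // offset of the first element of row i
        const std::size_t k2 = a.row_ptr[i + 1];  // one past the last element of row i
        double sum = 0.0;
        for (std::size_t k = k1; k < k2; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

Whether a compiler hoists these loads by itself when they are written
inside the inner loop depends, among other things, on whether it can
prove that nothing written in the loop aliases the row-pointer array.
Accumulating into a local variable, as above, avoids writing to memory
inside the inner loop.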
In Fortran 90 this is done in a straightforward way
(the code is taken from the book by Y. Saad, "Iterative methods for sparse linear
I'd like to know whether this kind of optimization is done in uBlas.
I've compiled the corresponding code with gcc using -O2 and -DNDEBUG,
but the resulting assembly code was too complex for me to follow. So I
suspect that the variables analogous to K1 and K2 are recomputed in the
inner loop as well. Am I right? Is this optimization done by compilers
other than gcc?