
Ublas : 
From: Vadim Zborovskii (vadim_z_at_[hidden])
Date: 2006-10-10 09:05:37
Dear colleagues!
In order to multiply a row of a matrix in compressed-row-storage (CRS)
format by a vector, we need to find the offset of this row in the indices
array. Then, for each index from this offset up to the end of the row,
we multiply the matrix element at this index by the corresponding vector
element and sum up these products.
So in the inner loop of the matrix-by-vector product calculation, the
offsets of the indices for the given row are loop invariants and thus
should not be recomputed on every iteration.
In Fortran 90 this is straightforward:
DO I=1,N
  K1=IA(I)
  K2=IA(I+1)-1
  Y(I)=DOT_PRODUCT(A(K1:K2),X(JA(K1:K2)))
ENDDO
(code taken from the book by Y. Saad, "Iterative Methods for Sparse Linear
Systems").
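
For comparison, here is roughly what I would expect the equivalent
hand-written C++ loop to look like. This is only a sketch: the CrsMatrix
struct and the crs_multiply function are hypothetical names made up for
illustration, the indices are 0-based, and this is not how uBLAS stores or
multiplies its compressed_matrix internally. The point is that k1 and k2 are
read once per row, outside the inner loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal CRS container (0-based indices); NOT the uBLAS type.
struct CrsMatrix {
    std::size_t rows;                  // number of rows (N)
    std::vector<std::size_t> row_ptr;  // size rows + 1; plays the role of IA
    std::vector<std::size_t> col_idx;  // column of each stored value; plays the role of JA
    std::vector<double> values;        // stored nonzero values; plays the role of A
};

// Computes y = m * x, reading the row offsets once per row.
std::vector<double> crs_multiply(const CrsMatrix& m, const std::vector<double>& x) {
    std::vector<double> y(m.rows, 0.0);
    for (std::size_t i = 0; i < m.rows; ++i) {
        // Loop invariants for the inner loop: hoisted out by hand.
        const std::size_t k1 = m.row_ptr[i];      // first stored entry of row i
        const std::size_t k2 = m.row_ptr[i + 1];  // one past the last entry of row i
        double sum = 0.0;
        for (std::size_t k = k1; k < k2; ++k) {
            sum += m.values[k] * x[m.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

Note that k2 here is one past the end of the row, whereas K2 in the Fortran
code is the last index of the row itself.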
I'd like to know whether this kind of optimization is done in uBLAS.
I've compiled the corresponding code with gcc
using -O2 and -DNDEBUG, but the resulting assembler code was too
complex for me to follow. So I suspect that the variables analogous to K1
and K2 are recomputed in the inner loop too. Am I right? Is this optimization
done by compilers other than gcc?
Vadim Zborovskiy