Hi
Markus,
First
off, I think it might be a good idea to turn off debugging by
defining:
#define NDEBUG
next,
if you can assume no shared memory between the left and right side of the
equation, then you can use noalias
noalias(v) = prod(u,M);
What
are the time results with these changes?
Ian
Hi,
we just
did some experiments on the following case, which occurs often in our
code:
v = prod(u,M)
where v and u are dense
vectors and M is a sparse matrix in compressed row major format.
We
also did an alternative implementatio of prod called ourProd in
the attached code. It follows the following algorithm:
for i in
M.size1():
v += u[i] * M[i,*]
where M[i,*] equals row i of
M.
This version often is 10-20 times faster than the prod in uBLAS on
sparse M. While this is all good, the implementation in test_fast() in the
attached source is sometimes 1000 times faster on very sparse matrices, for
example when running the code with the following parameters:
executable 100000 10000 1 0.001
The parameters are
as follows:
executable ROWS COLS HOWOFTEN
NONZEROPROBABILITY
where HOWOFTEN controls how often the experiments
are run.
Do you see any way to achieve better performance within uBLAS
for prod(u,M) with M being very sparse, as in 1 out of 1000 entries are set.
Thanks in advance,
Markus