Hi Markus,

First off, it would be a good idea to turn off the uBLAS debug checks by defining the following before any uBLAS header is included:

#define NDEBUG

Next, if you can assume that the left- and right-hand sides of the assignment do not alias (that is, they share no memory), you can use noalias, which lets uBLAS skip the temporary it would otherwise create:

noalias(v) = prod(u,M);

What are the time results with these changes?

Ian

-----Original Message-----
From: ublas-bounces@lists.boost.org [mailto:ublas-bounces@lists.boost.org] On Behalf Of Markus Weimer
Sent: Friday, June 29, 2007 3:53 AM
To: ublas mailing list
Subject: [ublas] vector * Matrix is slow compared to hand written code

Hi,

We just ran some experiments on the following case, which occurs often in our code:

v = prod(u,M)

where v and u are dense vectors and M is a sparse matrix in compressed row major format.

We also wrote an alternative implementation of prod, called ourProd, in the attached code. It uses the following algorithm:

for i in 0 .. M.size1()-1:
    v += u[i] * M[i,*]

where M[i,*] equals row i of M.
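
To make the row-wise idea concrete, here is a minimal sketch in plain C++. It is not the attached ourProd and it does not use uBLAS types: CsrMatrix and row_wise_prod are hypothetical names introduced only for illustration, and the layout is assumed to be a standard compressed-row format (row pointers, column indices, values).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal compressed-row container (not the uBLAS type).
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i occupies [row_ptr[i], row_ptr[i + 1])
    std::vector<std::size_t> col_idx;  // column index of each stored entry
    std::vector<double> values;        // value of each stored entry
};

// Computes v = u * M by adding u[i] * (row i of M) into v for every row i.
// Only the stored (nonzero) entries of M are visited.
std::vector<double> row_wise_prod(const std::vector<double>& u, const CsrMatrix& m) {
    assert(u.size() == m.rows);
    assert(m.row_ptr.size() == m.rows + 1);

    std::vector<double> v(m.cols, 0.0);  // v starts at zero
    for (std::size_t i = 0; i < m.rows; ++i) {
        const double ui = u[i];
        for (std::size_t k = m.row_ptr[i]; k < m.row_ptr[i + 1]; ++k) {
            v[m.col_idx[k]] += ui * m.values[k];
        }
    }
    return v;
}
```

The work in this sketch is proportional to the number of stored entries plus the number of rows, rather than to ROWS * COLS, which is the reason one would expect a row-wise loop to do well on a very sparse M.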

This version is often 10-20 times faster than the uBLAS prod on sparse M. While this is all good, the implementation in test_fast() in the attached source is sometimes 1000 times faster on very sparse matrices, for example when running the code with the following parameters:

   executable 100000 10000 1 0.001

The parameters are as follows:

   executable ROWS COLS HOWOFTEN NONZEROPROBABILITY

where HOWOFTEN controls how often the experiments are run.

Do you see any way to achieve better performance within uBLAS for prod(u,M) when M is very sparse, say with 1 out of 1000 entries set?

Thanks in advance,

Markus