Boost logo

Ublas :

From: choon (teochoonhui_at_[hidden])
Date: 2007-06-29 12:20:37


hi nico,

you can achieve the same result by a a series of scaling and sum of rows
like pointed out by markus:

> for i in M.size1():
> v += u[i] * M[i,*]

i believe the procedure above does not cause memory trashing, so, the
question is whether there is a specialized prod for this left-hand
multiplication in the case of row-major matrix?

thanks,
choon

Nico Galoppo wrote:
>
> Also, you hit the nail on the head when you said 'row-major' storage. C =
> u * M
> is basically a bunch of dot products with *columns* of M. So you may run
> into
> cache trashing issues as you are continuously accessing non-coherent
> memory.
>
> --nico
>
> Ian Fellows wrote:
>> Hi Markus,
>>
>> First off, I think it might be a good idea to turn off debugging by
>> defining:
>> #define NDEBUG
>>
>> next, if you can assume no shared memory between the left and right side
>> of the equation, then you can use noalias
>>
>> noalias(v) = prod(u,M);
>>
>>
>>
>> What are the time results with these changes?
>>
>>
>> Ian
>>
>> -----Original Message-----
>> *From:* ublas-bounces_at_[hidden]
>> [mailto:ublas-bounces_at_[hidden]]*On Behalf Of *Markus Weimer
>> *Sent:* Friday, June 29, 2007 3:53 AM
>> *To:* ublas mailing list
>> *Subject:* [ublas] vector * Matrix is slow compared to hand written
>> code
>>
>> Hi,
>>
>> we just did some experiments on the following case, which occurs
>> often in our code:
>>
>> v = prod(u,M)
>>
>> where v and u are dense vectors and M is a sparse matrix in
>> compressed row major format.
>>
>> We also did an alternative implementatio of prod called ourProd in
>> the attached code. It follows the following algorithm:
>>
>> for i in M.size1():
>> v += u[i] * M[i,*]
>>
>> where M[i,*] equals row i of M.
>>
>> This version often is 10-20 times faster than the prod in uBLAS on
>> sparse M. While this is all good, the implementation in test_fast()
>> in the attached source is sometimes 1000 times faster on very sparse
>> matrices, for example when running the code with the following
>> parameters:
>>
>> executable 100000 10000 1 0.001
>>
>> The parameters are as follows:
>>
>> executable ROWS COLS HOWOFTEN NONZEROPROBABILITY
>>
>> where HOWOFTEN controls how often the experiments are run.
>>
>> Do you see any way to achieve better performance within uBLAS for
>> prod(u,M) with M being very sparse, as in 1 out of 1000 entries are
>> set.
>>
>> Thanks in advance,
>>
>> Markus
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> ublas mailing list
>> ublas_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/ublas
>
> --
> Nico Galoppo UNC-CH PhD.student http://www.ngaloppo.org
> cell +1-919-360-5056
> _______________________________________________
> ublas mailing list
> ublas_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/ublas
>
>

-- 
View this message in context: http://www.nabble.com/vector-*-Matrix-is-slow-compared-to-hand-written-code-tf3999327.html#a11363014
Sent from the Boost - uBLAS mailing list archive at Nabble.com.