Subject: Re: [ublas] prod function usage
From: Umut Tabak (u.tabak_at_[hidden])
Date: 2011-05-24 04:58:35
On 05/24/2011 09:50 AM, Ungermann, Jörn wrote:
> For ublas::compressed_matrix-ublas::vector products, the answer is
> axpy_prod. This is already very fast, if you use a row-major ordering.
> However, we use a hand-tuned implementation based on axpy_prod that does
> manual loop unrolling, uses SSE instructions, and prefetches (~30-40%
> speedup in our main use-case). I doubt that you could get a larger speedup
> with a commercial solution. For dense matrices however, one should look for
> external solutions like ATLAS or the MKL.
I am grateful for the explanations.
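
For the archive, a rough sketch of why row-major ordering suits the sparse
matrix-vector product: with rows stored contiguously, each output entry is
one pass over that row's nonzeros. This is plain C++, not uBLAS code. The
CsrMatrix struct and the csr_axpy function are hypothetical names made up
for illustration, and the init flag only mimics the idea of the last
argument of axpy_prod(A, x, y, true).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major compressed (CSR) matrix; not a uBLAS type.
struct CsrMatrix {
    std::size_t rows = 0;
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into col_idx/values
    std::vector<std::size_t> col_idx;  // column index of each stored value
    std::vector<double> values;        // nonzeros, stored row by row
};

// Computes y = A * x when init is true, y += A * x otherwise.
// Assumes every column index is a valid index into x.
inline void csr_axpy(const CsrMatrix& A, const std::vector<double>& x,
                     std::vector<double>& y, bool init = true) {
    assert(A.row_ptr.size() == A.rows + 1);
    if (init) y.assign(A.rows, 0.0);
    assert(y.size() == A.rows);
    for (std::size_t i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        // The nonzeros of row i are contiguous, so this loop streams
        // through values and col_idx without jumping around in memory.
        for (std::size_t p = A.row_ptr[i]; p < A.row_ptr[i + 1]; ++p)
            sum += A.values[p] * x[A.col_idx[p]];
        y[i] += sum;
    }
}
```

The inner loop is the part a hand-tuned version would presumably unroll and
prefetch; the sketch leaves it as is.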
> Looks quite good. You probably want to cache the value of inner_prod(q, Mq),
> so you do not have to recalculate it repeatedly in the innermost loop.
> Second, the assignment of "basis[k] = ..." uses a temporary copy of the
> vector, which you can get rid of by "basis[k].minus_assign(r * q);" or
> "noalias(basis[k]) -= r * q;".
Thanks for these hints as well.
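
To make the two hints concrete, here is a small sketch in plain C++ rather
than uBLAS. The loop shape is an assumption, since the original routine is
not quoted in this message, and the names dot and m_orthogonalize are made
up for illustration. It computes the value of inner_prod(q, Mq) once
outside the loop and updates each basis vector in place, which is what
"noalias(basis[k]) -= r * q;" achieves in uBLAS by skipping the temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Plain dot product, standing in for uBLAS inner_prod.
inline double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Removes from every basis vector its component along q, measured in the
// M inner product. Mq is the precomputed product M * q (M assumed symmetric).
inline void m_orthogonalize(std::vector<Vec>& basis, const Vec& q,
                            const Vec& Mq) {
    const double qMq = dot(q, Mq);  // cached: computed once, not per vector
    assert(qMq != 0.0);
    for (Vec& b : basis) {
        const double r = dot(b, Mq) / qMq;
        // In-place update, no temporary vector is allocated.
        for (std::size_t i = 0; i < b.size(); ++i) b[i] -= r * q[i];
    }
}
```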
> The return instruction also seems unnecessary, as it explicitly copies an
> element of basis, which is returned by reference anyway.
Indeed, I got rid of this yesterday.