Ublas :
From: Gunter Winkler (guwi17_at_[hidden])
Date: 2008-01-22 17:08:12
On Tuesday, 22 January 2008 22:58, James Sutherland wrote:
> I have run this using compressed_matrix and find that the application
> is still 5x slower than my current implementation, which is not
> hand-tuned.
>
> I am using the following flags with g++ 3.4.6
> -O4 -fexpensive-optimizations -funroll-loops -DNDEBUG
>
> Running gprof shows that
> ublas::vector_assign
> consumes 40% of the execution time, while
> ublas::compressed_matrix::find2
> consumes 35% of the total time.
It looks like you are not using axpy_prod (see operation.hpp).
Without further information about what kind of matrix/vector operations
you need, we cannot give more specific hints. The general optimizations are:
* use noalias where possible
    noalias(x) = y + z;
* use axpy_prod where possible
    axpy_prod(A,x,y,false); // y += prod(A,x)
    axpy_prod(A,x,y,true);  // y = prod(A,x)
* use iterators instead of loops and operator(i,j)
Best regards,
Gunter