Subject: [ublas] axpy prod seems to be slow
From: Oswin Krause (Oswin.Krause_at_[hidden])
Date: 2011-09-06 15:26:22


Hello list,

axpy_prod seems to be way slower than prod in the following code snippet:

//the following is called 100 times during a run of the program
ublas::matrix<double> weightMatrix(32,16);//filled with some values
ublas::vector<double> hiddenInput(32);
RealVector stateVector(16);
double z = 0;
for (std::size_t x = 0; x < 65536; x++) {
   //generates the x-th state vector
   state(stateVector,x);

   //Version(1)
   hiddenInput.clear();
   axpy_prod(weightMatrix, stateVector, hiddenInput,false);
   //Version(2)
   //noalias(hiddenInput) = prod(weightMatrix, stateVector);

   z+=foo(stateVector,hiddenInput);
}
The complete program is a bit bigger. Version(1) has a runtime of 25
seconds, Version(2) only 11. When I turn off this code, I get ~6 seconds in
both versions. Also, foo(...) is quite an expensive operation (832
exponential function calls). Altogether, Version(2) seems to be around a
factor of ~4-5 faster, which is odd because axpy_prod is advertised as an
optimized function call.
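
For reference, the net ratio can be worked out by subtracting the baseline from both totals. This is only a sketch: net_ratio is a hypothetical helper name, and it assumes the ~6 seconds covers everything except the matrix-vector product. With the numbers above it compares 25 - 6 = 19 seconds against 11 - 6 = 5 seconds, i.e. a ratio of about 3.8.

```cpp
#include <cassert>

// Hypothetical helper: ratio of the time spent in the product alone,
// after removing the baseline measured with the product code disabled.
// Assumes the baseline is the same for both versions.
double net_ratio(double total_slow, double total_fast, double baseline) {
    assert(total_fast > baseline);  // guard against division by zero or a negative net time
    return (total_slow - baseline) / (total_fast - baseline);
}
```

Calling net_ratio(25.0, 11.0, 6.0) uses the three timings quoted in this message.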

My compiler is gcc 4.6.1 and my Boost version is 1.47.
I compile with -O3 and -DBOOST_UBLAS_NDEBUG.

Can someone tell me where my mistake is? Is axpy_prod only optimized for
big matrices? The code in operation.hpp (I think the version at line
127ff) looks quite reasonable to me.
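
One way to check the matrix-size question in isolation would be to time each call on its own, outside the full program, for several matrix sizes. The following is a minimal sketch using only the standard library; time_seconds is a hypothetical helper, not part of uBLAS.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `body` `repetitions` times and returns the
// elapsed wall-clock time in seconds.
template <class F>
double time_seconds(std::size_t repetitions, F body) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < repetitions; ++i) {
        body();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

The body passed in would be a lambda wrapping either Version(1) or Version(2) from the snippet above. The result vector should be consumed afterwards (for example, summed and printed) so that the optimizer cannot discard the computation.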

Greetings,
Oswin