Subject: Re: [ublas] prod function usage
From: Umut Tabak (u.tabak_at_[hidden])
Date: 2011-05-23 09:38:50


On 05/23/2011 02:56 PM, Ungermann, Jörn wrote:
> Hi Umut,
>
> if B is a simple diagonal (Tikhonov regularization) matrix, I'd make sure
> that the diagonal elements are contained in A to begin with and simply add
> it. This should be reasonably fast. Otherwise, there are all kinds of
> optimizations possible, depending on how often you might want to do that and
> what the matrices look like.
>
>
Dear Jorn,

I really appreciate the detailed explanation; however, there are some
points to clear up before going further. Here are my questions.

How can I find the most efficient matrix-vector products mentioned in
your text below, such as the overloaded gemv, and what are the most
efficient operations in ublas for a compressed_matrix type? Are they
really optimal, or still not fast enough? My experience is not sufficient
to judge that, at least for now.
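
To make the question concrete, these are the kinds of calls I am comparing
(just a minimal sketch; the size and the tridiagonal fill are placeholders):

#include <cstddef>
#include <boost/numeric/ublas/matrix_sparse.hpp>
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/operation.hpp>   // axpy_prod

using namespace boost::numeric::ublas;

int main()
{
    const std::size_t n = 1000;                 // placeholder size
    compressed_matrix<double> A(n, n);
    for (std::size_t i = 0; i < n; ++i) {       // toy tridiagonal fill
        A(i, i) = 2.0;
        if (i + 1 < n) { A(i, i + 1) = -1.0; A(i + 1, i) = -1.0; }
    }
    vector<double> x(n), y(n);
    for (std::size_t i = 0; i < n; ++i) x(i) = 1.0;

    noalias(y) = prod(A, x);     // expression-template product, no temporary
    axpy_prod(A, x, y, true);    // same result: y = A x, y zeroed first since init == true
    return 0;
}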

I implemented a simple Lanczos solver for generalized eigenvalue
problems, but the orthogonalizations, written with a simple template to
start with, seem to be slow, and I would like to learn whether there are
ways to speed up these kinds of operations. The codes you sent me are one
level up, to be used if ublas does not do the job out of the box.

template <class Vector, class Mat>
Vector modified_gram_schmidt(std::vector<Vector>& basis, Vector& v, Mat& M)
{
   basis.push_back(v);
   Vector q(v.size());
   Vector Mq(v.size());
   Vector z(v.size());
   for (std::size_t i = 0; i < basis.size(); ++i) {
     double norm = norm_2(basis[i]);
     q = basis[i] / norm;
     axpy_prod(M, q, Mq, true);                 // Mq = M q (init == true zeroes Mq first)
     for (std::size_t k = i + 1; k < basis.size(); ++k) {
       axpy_prod(M, basis[k], z, true);         // z = M basis[k]
       double r = inner_prod(q, z) / inner_prod(q, Mq);
       basis[k] = basis[k] - r * q;             // M-orthogonalize basis[k] against q
     }
   }
   return basis.back();
}

Do you have any suggestions for improving this snippet using the built-in
ublas facilities?
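
One idea I had: if M is symmetric (as it should be for the generalized
eigenvalue problems I have in mind), then inner_prod(q, M*basis[k]) equals
inner_prod(Mq, basis[k]), so the axpy_prod in the inner loop could be dropped
entirely. A rough sketch of what I mean, not tested for performance:

// Sketch only: assumes M is symmetric, so inner_prod(q, M*b) == inner_prod(Mq, b).
// The matrix-vector product then happens once per outer iteration instead of once per pair.
template <class Vector, class Mat>
Vector modified_gram_schmidt_sym(std::vector<Vector>& basis, Vector& v, Mat& M)
{
   basis.push_back(v);
   Vector q(v.size());
   Vector Mq(v.size());
   for (std::size_t i = 0; i < basis.size(); ++i) {
     q = basis[i] / norm_2(basis[i]);
     axpy_prod(M, q, Mq, true);                       // Mq = M q, computed once per i
     const double qMq = inner_prod(q, Mq);
     for (std::size_t k = i + 1; k < basis.size(); ++k) {
       const double r = inner_prod(Mq, basis[k]) / qMq;   // uses symmetry of M
       basis[k] -= r * q;
     }
   }
   return basis.back();
}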

My second question: can I gain something by interfacing the matrices to the
Intel MKL library? Of course there would be an interface cost, but perhaps
that could be amortized by doing it once in a routine or so. Do you have
experience with this, I mean for sparse matrix / dense vector
multiplications?
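
To make the interface cost concrete: as far as I can see, a row-major
compressed_matrix already stores its data in CSR form, so the main cost would
be converting the index arrays (ublas uses std::size_t) to MKL's integer type.
Something along these lines is what I had in mind (only a sketch; the actual
MKL call, e.g. mkl_dcsrmv or the inspector-executor routines, is not shown):

#include <vector>
#include <boost/numeric/ublas/matrix_sparse.hpp>

namespace ublas = boost::numeric::ublas;

// Sketch: copy the CSR arrays of a fully assembled, row-major, zero-based
// compressed_matrix into plain int/double arrays that an external kernel could use.
void to_csr(const ublas::compressed_matrix<double>& A,
            std::vector<double>& a, std::vector<int>& ia, std::vector<int>& ja)
{
    const std::size_t nnz = A.filled2();     // number of stored nonzeros
    a.assign(A.value_data().begin(),   A.value_data().begin()  + nnz);
    ja.assign(A.index2_data().begin(), A.index2_data().begin() + nnz);
    ia.assign(A.index1_data().begin(), A.index1_data().begin() + A.filled1());
    // ia/ja/a now describe A in zero-based CSR; this copy is the interface cost I mean.
}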

Thanks and best regards,
Umut