
Subject: Re: [ublas] Element-wise operations are really slow
From: Vardan Akopian (vakopian_at_[hidden])
Date: 2010-04-22 00:55:00

On Wed, Apr 21, 2010 at 6:55 PM, xiss burg <xissburg_at_[hidden]> wrote:

> Sunil,
> I tried mapped_matrix, and it performed as badly as compressed_matrix. I
> can't understand why, because everyone says it is faster, and I saw that
> myself in a simple app I wrote, like the sparse fill samples. I can't
> figure out what is wrong with my code. People recommended using a
> generalized_vector_of_vector to build the global stiffness matrix and then
> copying it to a compressed_matrix (which is a good choice for the
> multiplications later), and then I got something like this
> Just to explain that code better, t->getGlobalIndex(j) and
> t->getCorotatedStiffness0(j, k) are just regular getters; nothing special
> happens inside them. t->computeCorotatedStiffness() is not computationally
> expensive: it performs one Gram-Schmidt orthonormalization on a 3x3 matrix
> and then 32 3x3 matrix multiplies (not using ublas matrices there). No
> matter what matrix type I use for RKR_1 and RK, I get the same poor
> performance. One indication that those two lines are the bottleneck is
> that if I comment out the lines where I perform the element-wise sum, one
> of my samples runs 15-20 times faster (in this specific sample,
> m_tetrahedrons.size() == 617). I'm really at a loss here; I wouldn't like
> to throw away all my work with ublas because of this. There must be a
> solution to this problem.

Are you saying that it's slow even if you use a plain matrix<double>? Are you
compiling with a high enough optimization level that inline code actually
gets inlined? Do you have NDEBUG defined? If yes to all of the above, then
the slowness is not in ublas. Perhaps you should profile your application
and see where the real bottleneck is, or come up with a simple,
self-contained example that we can run and test too.
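For reference, the flags being asked about might look like this in a GCC/Clang build (the compiler invocation and file name are illustrative). Without NDEBUG, uBLAS performs bounds and consistency checks on every element access, which alone can account for an order-of-magnitude slowdown:

```shell
# -O2 (or -O3) so the expression templates actually get inlined;
# -DNDEBUG to disable uBLAS's per-access runtime checks.
g++ -O2 -DNDEBUG -o fem main.cpp
```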