I tried mapped_matrix, and it performed as badly as compressed_matrix. I can't understand why, because everyone says it is faster for filling, and I saw that myself in a simple app I wrote, similar to those sparse fill samples. I can't figure out what is wrong with my code. People recommended that I use a generalized_vector_of_vector to build the global stiffness matrix and then copy it to a compressed_matrix (which is a good choice for the multiplications later). I ended up with something like this: http://codepad.org/1Jwx1Lgb.

Just to explain that code better: t->getGlobalIndex(j) and t->getCorotatedStiffness0(j, k) are just regular getters, nothing special happens inside them. t->computeCorotatedStiffness() is not computationally expensive; it performs one Gram-Schmidt orthonormalization on a 3x3 matrix and then 32 3x3 matrix multiplies (not using ublas matrices there). No matter which matrix type I use for RKR_1 and RK, I get the same poor performance. The evidence that the element-wise sum is what makes the code slow is that if I comment out the two lines where I perform it, one of my samples runs 15-20 times faster (in this specific sample, m_tetrahedrons.size()==617). So I'm really lost here, and I wouldn't like to throw away all my work with ublas because of this. There must be a solution to this problem.
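To make the idea concrete without depending on ublas, here is a rough sketch in plain standard C++ of the "assemble into an insert-friendly structure, then copy into compressed storage" pattern. It is not the code from the codepad link and it is not the ublas API: Block3, RowMapBuilder, CsrMatrix and add_block are hypothetical names made up for this illustration, and I am assuming three degrees of freedom per node, so that node index i maps to rows 3*i .. 3*i+2.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical 3x3 block, standing in for one element stiffness block.
using Block3 = std::array<std::array<double, 3>, 3>;

// Compressed sparse row storage (assumption: row-major layout).
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> row_start;  // rows + 1 entries
    std::vector<std::size_t> col_index;  // column of each stored value
    std::vector<double> value;           // stored values, row by row
};

// Insert-friendly builder: one ordered map per row, so adding an entry
// never shifts the entries of other rows.
class RowMapBuilder {
public:
    RowMapBuilder(std::size_t rows, std::size_t cols)
        : cols_(cols), rows_(rows) {}

    // Element-wise sum of a 3x3 block at node position (node_i, node_j).
    // Note: zero entries of the block are stored explicitly as well.
    void add_block(std::size_t node_i, std::size_t node_j, const Block3& b) {
        assert(3 * node_i + 2 < rows_.size());
        assert(3 * node_j + 2 < cols_);
        for (std::size_t r = 0; r < 3; ++r) {
            for (std::size_t c = 0; c < 3; ++c) {
                rows_[3 * node_i + r][3 * node_j + c] += b[r][c];
            }
        }
    }

    // One pass over the rows, done once after assembly is finished.
    CsrMatrix compress() const {
        CsrMatrix m;
        m.rows = rows_.size();
        m.cols = cols_;
        m.row_start.reserve(m.rows + 1);
        m.row_start.push_back(0);
        for (const auto& row : rows_) {
            for (const auto& entry : row) {
                m.col_index.push_back(entry.first);
                m.value.push_back(entry.second);
            }
            m.row_start.push_back(m.col_index.size());
        }
        return m;
    }

private:
    std::size_t cols_;
    std::vector<std::map<std::size_t, double>> rows_;
};
```

The point of the split is that every += during assembly only touches one small per-row map, and the compressed arrays are written exactly once at the end, in order. Whether this matches what ublas does internally for generalized_vector_of_vector is something I have not verified.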

Rui,

Hey, Eigen looks awesome, very promising. If I can't solve this problem with ublas soon, I may give it a try. Thanks for the suggestion.

Thanks guys,

x