Subject: [ublas] Performance woes affecting ublas
From: Rui Maciel (rui.maciel_at_[hidden])
Date: 2010-05-10 12:47:47

I've just managed to migrate a small finite element application that I'm
writing from ublas to eigen, and I have to say that I saw an enormous
difference in performance.

I've migrated my code in the following two steps:

The first one consisted of migrating the global stiffness matrix, the global
nodal force vector and the solver (in effect, the part that dealt with the
K*d=f equation) from ublas and custom code to eigen. In short, this migration
consisted of replacing ublas' compressed_matrix with eigen's
DynamicSparseMatrix and replacing ublas' dense vector with an object of eigen's
Matrix<double, Dynamic,1> class.

As a result, this step alone led my small pet program to go from taking over
6 minutes to run the analysis down to around 20 seconds. Granted,
I had implemented the solvers myself without much info concerning the inner
workings of ublas' components, which means that they certainly suffered from
performance problems. Nonetheless, I've implemented 3 different solvers (Gauss
factorization with partial pivoting, Cholesky decomposition and conjugate
gradient method) and all three solvers took roughly the same amount of time to
solve a given system, including the cg method, which is basically a series of
algebraic operations.
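
To illustrate why the cg method is "basically a series of algebraic
operations", here is a minimal sketch in plain C++ using only the standard
library. It is not the code from my program and uses neither ublas nor eigen:
the CsrMatrix struct and the multiply, dot and conjugate_gradient functions
are hypothetical names made up for this example, and K is assumed to be
symmetric positive definite.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR (compressed sparse row) matrix for illustration;
// this is not a ublas or eigen type.
struct CsrMatrix {
    std::size_t n;                       // the matrix is n x n
    std::vector<std::size_t> row_start;  // n + 1 offsets into col/val
    std::vector<std::size_t> col;        // column index of each stored entry
    std::vector<double> val;             // value of each stored entry
};

// Sparse matrix-vector product: returns A * x.
std::vector<double> multiply(const CsrMatrix& A, const std::vector<double>& x) {
    assert(x.size() == A.n && A.row_start.size() == A.n + 1);
    std::vector<double> y(A.n, 0.0);
    for (std::size_t i = 0; i < A.n; ++i)
        for (std::size_t k = A.row_start[i]; k < A.row_start[i + 1]; ++k)
            y[i] += A.val[k] * x[A.col[k]];
    return y;
}

// Inner product of two vectors of equal length.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Unpreconditioned conjugate gradient for K d = f, starting from d = 0.
// Each iteration costs one matrix-vector product, two inner products and
// three vector updates.
std::vector<double> conjugate_gradient(const CsrMatrix& K,
                                       const std::vector<double>& f,
                                       double tol = 1e-10,
                                       std::size_t max_iter = 1000) {
    assert(f.size() == K.n);
    std::vector<double> d(K.n, 0.0);
    std::vector<double> r = f;  // residual f - K d, with d = 0
    std::vector<double> p = r;  // search direction
    double rr = dot(r, r);
    for (std::size_t it = 0; it < max_iter && std::sqrt(rr) > tol; ++it) {
        const std::vector<double> Kp = multiply(K, p);
        const double alpha = rr / dot(p, Kp);  // step length along p
        for (std::size_t i = 0; i < K.n; ++i) {
            d[i] += alpha * p[i];
            r[i] -= alpha * Kp[i];
        }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;  // makes the new direction K-conjugate
        for (std::size_t i = 0; i < K.n; ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return d;
}
```

Since every iteration is just these few kernels, the run time of a cg solver
is dominated by how fast the underlying library performs the sparse
matrix-vector product and the vector updates.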

Having finished that step, I moved on to migrate the remaining ublas code to
eigen. The second part consisted of a handful of dense matrices which were
subjected basically to a series of matrix assignments and multiplications,
along with the inversion and the calculation of the determinant of a 3x3
matrix. This step cut the time it took to run the analysis from around 20
seconds down to 5 seconds.
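
For reference, the determinant and the inverse of a 3x3 matrix can be written
in closed form through cofactors. The sketch below is plain C++ with only the
standard library and is not taken from my program; Mat3, determinant and
inverse are hypothetical names chosen for this example, and the singularity
threshold is an arbitrary assumption.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical 3x3 matrix type for illustration, stored row by row.
using Mat3 = std::array<std::array<double, 3>, 3>;

// Determinant by cofactor expansion along the first row.
double determinant(const Mat3& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Inverse as the adjugate divided by the determinant.
// Entry (i, j) of the inverse is the cofactor of entry (j, i) over det.
Mat3 inverse(const Mat3& m) {
    const double det = determinant(m);
    assert(std::fabs(det) > 1e-14);  // assumed threshold: reject singular input
    Mat3 inv{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            // Cyclic indices produce the signed cofactor directly.
            const int r0 = (j + 1) % 3, r1 = (j + 2) % 3;
            const int c0 = (i + 1) % 3, c1 = (i + 2) % 3;
            inv[i][j] = (m[r0][c0] * m[r1][c1] - m[r0][c1] * m[r1][c0]) / det;
        }
    }
    return inv;
}
```

At this size the whole computation is a few dozen multiplications, so any
overhead a library adds around it (temporaries, dynamic allocation, bounds
checks) tends to dominate the measured time.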

So, summing things up, migrating from ublas and a set of hand-made solvers to
eigen made it possible for my program to go from taking over 6 minutes to
solve a simple problem to taking around 5 seconds to perform the same task.

Again, I acknowledge that my sloppy code certainly had a lot to do with the
abysmal performance penalty experienced in the ublas version of my program.
Nonetheless, this problem could be avoided, at least in part, if the
documentation were improved in key areas, such as common gotchas associated
with sparse matrices and the efficiency associated with basic operations.

Also, during the migration I noticed that ublas is far from efficient even
when used to perform simple tasks such as products between smallish dense
matrices (from 3x3 to 81x6) and between dense matrices and dense vectors, an
aspect of ublas whose tuning was supposed to be a focus.
No matter how sloppy any code is, if your code takes a 3.4x performance
penalty just for performing basic tasks such as products between small dense
data types... Well, that is a good sign that something isn't working right.
I'm aware that there were no promises made regarding efficiency, but a
difference of this magnitude leaves a lot to be desired.

Rui Maciel