# Ublas :

From: Per Abrahamsen (per.abrahamsen_at_[hidden])
Date: 2008-01-21 07:55:40

The core of my application is basically solving a linear equation system:

A x = b

Where A is mostly zero. After using a sparse equation solver
(cxsparse), I found hat most of the time is spend a surprising place,
namely in matrix addition. A (a compressed matrix, because that is
what cxsparse expects) is build by adding three zero width banded
matrixes (B1, B2, and B3) and three compressed matrixes (C1, C2, and
C3).

A = B1 + B2 + B3 + C1 + C2 + C3

I'd expected the solver to take most of the time, followed by the
building of the component matrixes (B1..B3 and C1..C3). Not so.
Everything else is noise compared to the time spend in matrix
addition. Something like 2 seconds in the solver, 20 seconds building
the component matrixes, and 3 minutes adding them together.

I tried using different matrix types, everything made the situation
worse except 1) using dense matrixes, and 2) using temporary
coordinate matrix and adding the compressed matrixes with this
function:

static void
add_to (const ublas::compressed_matrix<double>& from,
ublas::coordinate_matrix<double>& to)
{
for (ublas::compressed_matrix<double>::const_iterator1 i1 = from.begin1 ();
i1 != from.end1 ();
++i1)
for (ublas::compressed_matrix<double>::const_iterator2 i2 = i1.begin ();
i2 != i1.end ();
++i2)
to.append_element (i2.index1 (), i2.index2 (), *i2);
}

and the banded matrixes with an index loop:

for (size_t c = 0; c < cell_size; ++c)
A_coord.append_element (c, c, B1 (c, c) + B2 (c, c) + B3 (c, c));

Both approaches put the time spend in the addition (and converting to
compressed) down to 1 minute. Still, three times as much time spend
on addition than on building the component matrixes. Any suggestions
for how to improve on this?

The only idea I have left is to drop the component matrixes, and build
A directly. But I fear this would leave the code very hard to
maintain.

PS: I build with -DNDEBUG and -O3.