Boost logo

Ublas :

Subject: [ublas] inplace_solve for row_major matrices
From: Jörn Ungermann (j.ungermann_at_[hidden])
Date: 2009-12-14 18:28:32


I have a sparse Cholesky decomposition and want to use it to solve a
linear equation system via inplace_solve:

int n; // n ~ 10000--100000
ublas::vector<double> x(n);
ublas::compressed_matrix<double, ublas::row_major> s_a_inv_root(n,n);
// fill x and s_a_inv_root
ublas::inplace_solve(trans(s_a_inv_root), x, ublas::lower_tag());
ublas::inplace_solve(s_a_inv_root, x, ublas::upper_tag());

This works like a charm, but takes astonishingly long to complete.
Digging deeper, I found that the second inplace_solve takes several
orders of magnitudes longer than the first one.
Digging even deeper, I found this in triangular.hpp:
    template<class E1, class E2>
    void inplace_solve (const matrix_expression<E1> &e1, vector_expression<E2> &e2,
                        lower_tag, row_major_tag) {
        typedef typename E1::storage_category storage_category;
        inplace_solve (e2, trans (e1),
                       upper_tag (), row_major_tag (), storage_category ());
This means that the row_major matrix is transposed (which should make it
column_major) and then treated like a row_major matrix.
No wonder, it takes forever.
The same is true for the inplace_solve(vector, matrix) kind of
functions, where the column_major case is not handled efficiently

Naively I would program something like this, which works fine on e1
being row_major and upper_tag:

    template<class E1, class E2>
    void inplace_solvex (const matrix_expression<E1> &e1, vector_expression<E2> &e2) {
        typedef typename E2::size_type size_type;
        typedef typename E2::difference_type difference_type;
        typedef typename E2::value_type value_type;

        BOOST_UBLAS_CHECK (e1 ().size1 () == e1 ().size2 (), bad_size ());
        BOOST_UBLAS_CHECK (e1 ().size2 () == e2 ().size (), bad_size ());
        size_type size = e1 ().size1 ();
        for (difference_type n = size-1; n >=0; -- n) {
            BOOST_UBLAS_CHECK (e1 () (n, n) != value_type/*zero*/(), singular ());
            if (e1 () (n, n) == value_type/*zero*/())
                singular ().raise ();
            value_type t = e2 () (n);
            typename E1::const_iterator2 it1e2 (e1 ().find2 (1, n, n+1));
            typename E1::const_iterator2 it1e2_end (e1 ().find2 (1, n, e1 ().size2 ()));
            while (it1e2 != it1e2_end) {
              t -= *it1e2 * e2 () (it1e2.index2());
              ++ it1e2;
            e2() (n) = t / e1 () (n, n);


This could be easily extend to cover all missing cases in

Now my questions:
1) Is there a special reason why this case is not efficiently treated?
2) If yes, what should I do to efficiently solve that equation system?
3) If no, should I flesh out "my" algorithm so that it can be added to
triangular.hpp and how should I proceed to do so?

Thanks and kind regards,
Joern Ungermann

Jörn Ungermann
Institut für Chemie und Dynamik der Geosphäre: Stratosphäre
Forschungszentrum Jülich
Telefon: +49 2461 61-1840
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe
Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
Prof. Dr. Sebastian M. Schmidt