From: Peter Schmitteckert (peter_at_[hidden])
Date: 2005-08-25 01:08:03
Dear Gunter,
On Wed, 24 Aug 2005, Gunter Winkler wrote:
> The table lists the times for performing the cholesky decomposition of
> different matrix types and sizes. The first line of each block shows the
> decomposition time of a dense matrix (only accessing the lower triangle). The
> second line uses a (lower) triangular_matrix. This should at least be as fast
> as the dense case but takes 3 times as long. The third line show the times
> for a banded matrix with (at most) 50 bands below the diagonal. (If the
> banded matrix has all maximal possible bands it is even slower than the
> triangular matrix.)
>
> I suspect the matrix rows/columns and vector ranges to be the source of the
> slow down. So I have to do more research on that.
Just as a hint, in my application I suffer large performance problems
from various matrix_assign calls, which are very pronounced on
the Itanium architecture.
> see http://www.bauv.unibw-muenchen.de/~winkler/ublas/examples/
>
> Do you have an idea how to reduce the slow down?
I've changed the following lines to be able to compile
with boost_1_33:
// use dense matrix
ublas::matrix<DBL> A ( ublas::zero_matrix<DBL>(size, size) );
ublas::matrix<DBL> T (size, size);
ublas::matrix<DBL> L (size, size);
//A = ublas::zero_matrix<DBL>(size, size);
and compiled with
g++-3.3 -I /home/peter/Boost/boost-stable -O2 -g -pg cholesky_test.cpp -o C
then a
./C 100
gprof C > Q.out
Most of the time is spent in the access functions like operator [], (),
element etc. And the timing indicated that access for the
triangular matrices is more expensive than for the normal ones.
Of course, I made the usual mistake (forgetting -DNDEBUG) and repeated the profiling with
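To illustrate why element access can cost more for a packed triangular matrix than for a dense one, here is a minimal sketch that does not use uBLAS at all. The names dense_index, packed_lower_index and packed_lower_get are hypothetical, and the row-major packed lower-triangle layout is an assumption; the actual triangular_matrix code in uBLAS may differ in detail.

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (i, j) in a dense row-major n x n matrix:
// one multiply and one add.
inline std::size_t dense_index(std::size_t i, std::size_t j, std::size_t n) {
    assert(i < n && j < n);
    return i * n + j;
}

// Offset of element (i, j) in a packed lower triangle stored row by row
// (row i holds i + 1 elements). Only valid for j <= i.
inline std::size_t packed_lower_index(std::size_t i, std::size_t j) {
    assert(j <= i);
    return i * (i + 1) / 2 + j;
}

// Read access that mimics a triangular matrix: elements above the
// diagonal are not stored and read as zero, hence the extra branch.
inline double packed_lower_get(const double* data, std::size_t i, std::size_t j) {
    return (j <= i) ? data[packed_lower_index(i, j)] : 0.0;
}
```

In this sketch every packed read pays for a comparison and a slightly longer index computation, which adds up when the access sits in the innermost loop of the decomposition.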
g++-3.3 -DNDEBUG -I /home/peter/Boost/boost-stable -O2 -g -pg cholesky_test.cpp -o C
Now I find that most of the time is spent in
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
45.46      0.45     0.45        1   450.05   563.67  void
boost::numeric::ublas::indexing_matrix_assign<boost::numeric::ublas::scalar_assign,
boost::numeric::ublas::matrix<double,
boost::numeric::ublas::basic_row_major<unsigned int, int>,
boost::numeric::ublas::unbounded_array<double, std::allocator<double> > >,
boost::numeric::ublas::matrix_matrix_binary<boost::numeric::ublas::matrix<double,
boost::numeric::ublas::basic_row_major<unsigned int, int>,
boost::numeric::ublas::unbounded_array<double, std::allocator<double> > >,
boost::numeric::ublas::matrix_unary2<boost::numeric::ublas::matrix<double,
boost::numeric::ublas::basic_row_major<unsigned int, int>,
boost::numeric::ublas::unbounded_array<double, std::allocator<double> > >,
boost::numeric::ublas::scalar_identity<double> >,
boost::numeric::ublas::matrix_matrix_prod<double, double, double> >
>(boost::numeric::ublas::matrix<double,
boost::numeric::ublas::basic_row_major<unsigned int, int>,
boost::numeric::ublas::unbounded_array<double, std::allocator<double> >
>&,
boost::numeric::ublas::matrix_expression<boost::numeric::ublas::matrix_matrix_binary<boost::numeric::ublas::matrix<double,
boost::numeric::ublas::basic_row_major<unsigned int, int>,
boost::numeric::ublas::unbounded_array<double, std::allocator<double> > >,
boost::numeric::ublas::matrix_unary2<boost::numeric::ublas::matrix<double,
boost::numeric::ublas::basic_row_major<unsigned int, int>,
boost::numeric::ublas::unbounded_array<double, std::allocator<double> > >,
boost::numeric::ublas::scalar_identity<double> >,
boost::numeric::ublas::matrix_matrix_prod<double, double, double> > >
const&, boost::numeric::ublas::row_major_tag)
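Reading the symbol name, this appears to be the assignment of a product of a dense matrix with a transposed dense matrix (matrix_unary2 with scalar_identity is the type that trans() returns) into a dense matrix. The following is only a rough sketch of what such an indexing-style assignment does, written without uBLAS; Dense, ProdTrans and indexing_assign are hypothetical stand-ins, not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal dense row-major matrix, standing in for ublas::matrix<double>.
struct Dense {
    std::size_t n;
    std::vector<double> a;
    explicit Dense(std::size_t n_) : n(n_), a(n_ * n_, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * n + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

// Lazy expression for lhs * trans(rhs): nothing is computed until an
// element is requested.
struct ProdTrans {
    const Dense& lhs;
    const Dense& rhs;
    double operator()(std::size_t i, std::size_t j) const {
        double s = 0.0;
        for (std::size_t k = 0; k < lhs.n; ++k)
            s += lhs(i, k) * rhs(j, k);  // trans(rhs)(k, j) == rhs(j, k)
        return s;
    }
};

// Indexing-style assignment: visit every (i, j) of the target and
// evaluate the expression at that position.
inline void indexing_assign(Dense& target, const ProdTrans& e) {
    assert(target.n == e.lhs.n && target.n == e.rhs.n);
    for (std::size_t i = 0; i < target.n; ++i)
        for (std::size_t j = 0; j < target.n; ++j)
            target(i, j) = e(i, j);
}
```

Under these assumptions the whole product is evaluated inside a single assignment call, one element access at a time, which would be consistent with one call accounting for a large share of the run time.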
Interestingly, -funroll-loops seems to help the first version slightly,
but not the second one.
Best regards,
Peter