Ublas:
From: Paul C. Leopardi (leopardi_at_[hidden])
Date: 2005-03-13 10:24:32
Hi all,
Reply below.
Best regards
On Sun, 13 Mar 2005 08:46 pm, Dima Sorkin wrote:
> Quoting Gunter Winkler <guwi17_at_[hidden]>:
> > On Friday, 11 March 2005 at 18:15, Dima Sorkin wrote:
> > > Hi.
> > > I have a following code:
> > > sparse_matrix<double> m1(N,N);
> > > // fill m1 ...
> > > compressed_matrix<double> m2(m1);
> >
> > An alternative would be to use a coordinate_matrix to fill the data. The
> > conversion from a coordinate_matrix to a compressed_matrix is very fast.
> > A third way is to precompute the structure of the compressed_matrix
> > and fill it using push_back(). See
> > http://www.bauv.unibw-muenchen.de/~winkler/ublas/sparse_comparison.html
> > for more details.
>
> Hi.
> Thank you.
> Please read this message to the end, since I believe I am making a serious
> claim here.
>
> There is currently no way to detect the sparse structure, or even the number
> of non-zeros, of a matrix_expression, since it is unified across all matrix
> types (sparse and dense). Because of the use of expression templates, this
> leads to the following results:
>
> (I must remark that I use Boost release 1.31.1, but I have examined the
> 1.32 sources and it seems that the results will be the same.)
>
> sparse_matrix<double> mt(N,N);
> compressed_matrix<double> ms1(N,N), ms2(N,N); // later filled with N
> non-zeros each
>
> compressed_matrix<double> ms3(-ms1); // O(N^2) time
> ms1 = mt; // O(N^2) time
> mt = prod(ms1,ms2); // seems O(N^2), I didn't have time to check it
> individually
>
>
> Because of these examples (and similar cases), I think that the use of
> expression templates in sparse operations, as it is done now, is a serious
> performance bug in ublas. I may, however, be missing some point. Can someone
> explain my mistake?
>
> If I am right, then some kind of "sparse_matrix_expression" class should
> be introduced in ublas. For now I use a workaround: hand-written functions
> that access individual elements of ublas sparse matrices and give O(N) or
> O(N*log N) performance in all the cases I have. But they are surely far from
> optimal.
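For reference, the two alternative fill strategies Gunter describes above
could look roughly as follows. This is only a sketch (not code from this
thread), with arbitrary fill values:

#include <boost/numeric/ublas/matrix_sparse.hpp>

namespace ublas = boost::numeric::ublas;

void fill_strategies(std::size_t N)
{
  // 1. Fill a coordinate_matrix, then convert it. The conversion to a
  //    compressed_matrix is fast because the entries only need to be
  //    sorted once.
  ublas::coordinate_matrix<double> mc(N, N);
  for (std::size_t k = 0; k != N; ++k)
    mc(k, k) = 1.0/(1.0 + k);
  ublas::compressed_matrix<double> m2(mc);

  // 2. Reserve space in the compressed_matrix and fill it directly with
  //    push_back(), visiting the non-zeros in storage (row-major) order.
  ublas::compressed_matrix<double> m3(N, N, N); // reserve N non-zeros
  for (std::size_t k = 0; k != N; ++k)
    m3.push_back(k, k, 1.0/(1.0 + k));
}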
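Similarly, the kind of hand-written, element-wise workaround Dima mentions can
be expressed with the uBLAS sparse iterators, which visit only the stored
elements, so the cost grows with the number of non-zeros rather than with N^2.
Again only a sketch, assuming a row-major sparse source type:

#include <boost/numeric/ublas/matrix_sparse.hpp>

namespace ublas = boost::numeric::ublas;

template <class Source>
ublas::compressed_matrix<double> sparse_copy(const Source& src)
{
  ublas::compressed_matrix<double> dst(src.size1(), src.size2());
  typedef typename Source::const_iterator1 row_iterator;
  typedef typename Source::const_iterator2 element_iterator;
  // Iterate row by row over the stored (non-zero) elements only.
  for (row_iterator it1 = src.begin1(); it1 != src.end1(); ++it1)
    for (element_iterator it2 = it1.begin(); it2 != it1.end(); ++it2)
      dst(it2.index1(), it2.index2()) = *it2;
  return dst;
}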
I tried testing these complexity claims using the code listed below, for N
from 1 to 100.
My results were:
compressed_matrix<double> ms3(-ms1); // O(N)
ms1 = mt; // O(N)
mt = prod(ms1,ms2); // O(N^3/log N)
The results for prod were most surprising.
Code snippet:
#include "test_ublas.h"
using namespace std;
using namespace boost::numeric::ublas;
using namespace ublas_test;
void test_trial(int n)
{
matrix_index_t N = n;
double total_neg_time = 0;
double total_ass_time = 0;
double total_sprod_time = 0;
int n_trials = 16384;
sparse_matrix<double> mt(N,N);
compressed_matrix<double> ms1(N,N),ms2(N,N);//Later filled with N non zeros
for (matrix_index_t k = 0; k != N; k++)
{
ms1(k,k) = 1.0/(1.0 + int(k));
ms2(k,N-k-1) = 1.0/ms1(k,k);
}
for (int trial_n = 0; trial_n != n_trials; trial_n++)
{
for (matrix_index_t k = 0; k != N; k++)
{
ms1(k,k) = 1.0/(1.0 + int(k));
mt(k,N-k-1) = 1.0/ms1(k,k);
}
clock_t cpu_time = clock();
compressed_matrix<double> ms3(-ms1);
total_neg_time += elapsed(cpu_time);
cpu_time = clock();
ms1 = mt ;
total_ass_time += elapsed(cpu_time);
}
int n_sprod_trials = 128;
clock_t cpu_time = clock();
for (int trial_n = 0; trial_n != n_sprod_trials; trial_n++)
{
mt = prod(ms1,ms2);
}
total_sprod_time += elapsed(cpu_time);
cout << "n: " << N
<< ", neg time: " << total_neg_time/n_trials
<< ", ass time: " << total_ass_time/n_trials
<< ", sprod time: " << total_sprod_time/n_sprod_trials
<< endl;
}
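The contents of test_ublas.h are not shown in this post, so the following
stand-in (matrix_index_t and elapsed() here are assumed definitions, not
necessarily the ones actually used) plus a driver that runs test_trial() for
N from 1 to 100 is only a sketch of how the snippet might be built and run:

// assumed stand-in for test_ublas.h
#include <cstddef>
#include <ctime>
#include <iostream>
#include <boost/numeric/ublas/matrix_sparse.hpp>

namespace ublas_test
{
  typedef std::size_t matrix_index_t;

  // CPU seconds elapsed since `start`.
  inline double elapsed(std::clock_t start)
  {
    return double(std::clock() - start) / CLOCKS_PER_SEC;
  }
}

// driver
void test_trial(int n); // the function listed above

int main()
{
  for (int n = 1; n <= 100; ++n)
    test_trial(n);
  return 0;
}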