|
Ublas : |
From: Gunter Winkler (guwi17_at_[hidden])
Date: 2005-03-13 16:55:41
Hello,
my results are a different than these of Paul.
Am Sonntag, 13. März 2005 10:46 schrieb Dima Sorkin:
> Because of these ( and similar cases ) examples,I think that use of
> expression templates in sparse operations, as it done now, is serious
> performance bug of ublas. I may, however , miss some point. Can
> someone explain my mistake ?
I don't think so, because each expression contains information about
storage layout, sparsity and so on. This information can be used (and
is used by most expressions). Of course it is a bug if the wrong
algorithm is used. I think matrix_assign does still not work perfectly.
> Minor remarks:
> By the way it was undocumented in ublas help that compressed_matrix
> has direct constructor from coordinate_matrix. According to ublas
> help it can be constructed from matrix_expression only.
That correct. There is no constructor from COO. But its easy to convert
the data. I could create such a constructor ...
> Second minor remark : in ublas documentation the arguments of sparse
> matrices constructors written in reverse order - nnzeros comes first.
Thanks. I should change this.
Now my results:
(compressed_matrix: cm1, cm2; mapped_matrix (was sparse_matrix) mm)
I first ran the expression constructor cm2(-cm1) with different sizes
and number of nonzeros (now: cm1.nnz()). The result was O(n^2) where n
is the size of the matrix.
2nd I ran an assign cm = mm
3rd I ran the product mm = prod(cm1, trans(cm2)) which gave O(n^2) where
n is the size of the matrix. (I also timed cm3 = prod(cm1, trans(cm2)))
size nnz nnz copy assign prod1 nnz_p1 prod2 nnz_p2
8000 100 100 0 6.26 13.67 104 14.79 104
8000 200 200 0 6.5 13.76 201 14.91 201
8000 400 400 0 6.76 14 401 15.1 401
8000 800 800 0 6.82 14.46 828 15.51 828
8000 1600 1600 0 6.51 15.24 1752 16.36 1752
8000 3200 3200 0.01 5.58 16.83 3977 17.73 3977
8000 6400 6400 0 3.83 19.09 9660 19.98 9660
8000 12800 12800 0 1.79 22.16 26934 23.03 26934
all times seem to be O(size^2) but nearly independent of nnz. Why? The
iterator produce very much overhead. (The decreasing times for the
assign are interesting ...)
I'll put more extended numbers on my page.
mfg
Gunter