
Ublas : 
From: Gunter Winkler (guwi17_at_[hidden])
Date: 2005-06-08 15:35:35
On Tuesday, 7 June 2005 23:14, Jack Nguyen wrote:
> I have an Intel Xeon 1.7 GHz w/ 3-level cache and use g++ with -O4 and
> -DNDEBUG. My matrices are 1000x1000 squares filled w/ random complex
> float values. A and C are row-major; B is column-major.
> Ublas takes around 17s for the following matrix multiplication
> methods:
>   C = prod(A, B);
>   C.assign(prod(A, B));
>   noalias(C) = block_prod<matrix<std::complex<float>, row_major>, 64>(A, B);
>
> A C implementation that calls blas3 takes around 1.74s
Yes, ublas can not (and will not) compete with atlas. You can play with
different matrix products and matrix sizes with the attached sample
program.
Size of matrices: 500x500 * 500x500

       prod   axpy   opb    block  goto   atlas
RRR    2.71   1.52   1.73   0.47   0.45   0.12
RRC    1.17   2.85   1.75   0.47   0.45   0.13
RCR    6.01   1.57   1.7    0.48   0.45   0.12
RCC    2.6    2.86   1.72   0.48   0.45   0.13
CRR    2.6    2.86   1.72   0.47   0.44   0.13
CRC    1.13   2.84   1.75   0.48   0.45   0.12
CCR    6.01   1.5    1.71   0.48   0.46   0.11
CCC    2.7    1.48   1.73   0.47   0.45   0.13
(The first column gives the storage orientation of X, A and B, with
R = row-major and C = column-major; the other columns give the times on
my Athlon XP (1466 MHz) for X += A*B using the different products.)
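
To make the orientation codes concrete, here is a minimal sketch in plain
C++ (not ublas; the Mat struct and prod_into are made-up names for
illustration only). A matrix lives in a flat vector, stored either row by
row (R) or column by column (C), and X += A*B is computed with the simple
inner-product loop. With A row-major and B column-major (the RRC and CRC
rows), the inner loop walks both operands contiguously, which is a
plausible reason why prod does best there.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type, not part of ublas.
struct Mat {
    std::size_t rows, cols;
    bool row_major;              // true = 'R', false = 'C'
    std::vector<double> data;    // flat storage, rows * cols elements

    Mat(std::size_t r, std::size_t c, bool rm)
        : rows(r), cols(c), row_major(rm), data(r * c, 0.0) {}

    // Map (i, j) to the flat index according to the storage orientation.
    double& operator()(std::size_t i, std::size_t j) {
        return row_major ? data[i * cols + j] : data[j * rows + i];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return row_major ? data[i * cols + j] : data[j * rows + i];
    }
};

// X += A * B using the plain inner-product loop order (i, j, k).
void prod_into(Mat& X, const Mat& A, const Mat& B) {
    assert(A.cols == B.rows && X.rows == A.rows && X.cols == B.cols);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t j = 0; j < B.cols; ++j) {
            double sum = 0.0;
            // Row i of A times column j of B.
            for (std::size_t k = 0; k < A.cols; ++k)
                sum += A(i, k) * B(k, j);
            X(i, j) += sum;
        }
    }
}
```

The result is the same for all eight orientation combinations; only the
memory access pattern (and therefore the speed) changes.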
As you can see, atlas is roughly 4 times faster than even the fastest of
the other products. For more details, please look at the source.
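
The gap between the prod and block columns is usually attributed to
better cache reuse. As a rough sketch of the general idea behind a
blocked product (plain C++ on row-major flat arrays; blocked_prod and its
bs parameter are illustrative names, and this is not how ublas's
block_prod is actually implemented):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// X += A * B for n x n row-major matrices stored in flat vectors.
// The work is split into bs x bs tiles so that each tile of A, B and X
// is reused while it is still likely to be in cache.
void blocked_prod(std::vector<double>& X,
                  const std::vector<double>& A,
                  const std::vector<double>& B,
                  std::size_t n, std::size_t bs) {
    assert(bs > 0);
    assert(X.size() == n * n && A.size() == n * n && B.size() == n * n);
    for (std::size_t ii = 0; ii < n; ii += bs) {
        const std::size_t imax = std::min(ii + bs, n);
        for (std::size_t kk = 0; kk < n; kk += bs) {
            const std::size_t kmax = std::min(kk + bs, n);
            for (std::size_t jj = 0; jj < n; jj += bs) {
                const std::size_t jmax = std::min(jj + bs, n);
                // Multiply one tile of A by one tile of B.
                for (std::size_t i = ii; i < imax; ++i)
                    for (std::size_t k = kk; k < kmax; ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < jmax; ++j)
                            X[i * n + j] += a * B[k * n + j];
                    }
            }
        }
    }
}
```

A good block size depends on the cache sizes of the machine; the 64 in
the quoted block_prod call plays the same role as bs here. Tuned BLAS
libraries go well beyond this simple tiling.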
Best regards,
Gunter
PS: Thanks again to the one who once provided the nice macros ...