
Ublas :

From: Gunter Winkler (guwi17_at_[hidden])
Date: 2005-06-08 15:35:35


On Tuesday, 7 June 2005 23:14, Jack Nguyen wrote:
> I have intel xeon 1.7ghz w/ 3 level cache and have g++ on -O4 w/
> -DNDEBUG. My matrices are 1000x1000 squares filled w/ random complex
> float values. A,C are row-major. B is column-major.
> Ublas takes around 17s for the following matrix multiplication
> methods:
> - C=prod(A,B);
> - C.assign(prod(A,B));
> - noalias(C)=block_prod< matrix<std::complex<float>, row_major >, 64>
>   (A, B);
>
> A C implementation that calls blas3 takes around 1.74s

Yes, ublas cannot (and will not) compete with atlas. You can play with
different matrix products and matrix sizes with the attached sample
program.

Size of matrices: 500x500 * 500x500

      prod  axpy  opb   block  goto  atlas
RRR   2.71  1.52  1.73  0.47   0.45  0.12
RRC   1.17  2.85  1.75  0.47   0.45  0.13
RCR   6.01  1.57  1.7   0.48   0.45  0.12
RCC   2.6   2.86  1.72  0.48   0.45  0.13
CRR   2.6   2.86  1.72  0.47   0.44  0.13
CRC   1.13  2.84  1.75  0.48   0.45  0.12
CCR   6.01  1.5   1.71  0.48   0.46  0.11
CCC   2.7   1.48  1.73  0.47   0.45  0.13

(The first column gives the storage orientation of X, A and B, with R
for row-major and C for column-major; the other columns give the times
on my Athlon XP (1466MHz) for X += A*B using different products.)
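To make the three-letter codes concrete, here is a minimal sketch in plain C++ (no ublas). The `Dense` type, the `Layout` enum and `add_product` are hypothetical names made up for this illustration; they are not part of ublas or of the attached program. The sketch only shows how the storage orientation changes the memory offset behind the same X += A*B loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a ublas type: a dense n x n matrix of doubles
// kept in one flat array, either row-major (R) or column-major (C).
enum class Layout { RowMajor, ColMajor };

struct Dense {
    std::size_t n;
    Layout layout;
    std::vector<double> data;

    Dense(std::size_t n_, Layout l) : n(n_), layout(l), data(n_ * n_, 0.0) {}

    // Map (i, j) to the flat offset for the chosen layout.
    double& operator()(std::size_t i, std::size_t j) {
        return data[layout == Layout::RowMajor ? i * n + j : j * n + i];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return data[layout == Layout::RowMajor ? i * n + j : j * n + i];
    }
};

// X += A * B as a plain inner-product loop. The arithmetic is the same
// for every layout combination; only the memory access pattern differs.
void add_product(Dense& X, const Dense& A, const Dense& B) {
    assert(X.n == A.n && A.n == B.n);
    const std::size_t n = X.n;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += A(i, k) * B(k, j);  // row i of A times column j of B
            }
            X(i, j) += sum;
        }
    }
}
```

In this loop order the inner index k walks along a row of A and down a column of B, so a row-major A and a column-major B are both read contiguously. That may be why the RRC and CRC rows are the fastest entries in the prod column, but this is a reading of the table, not something checked against the ublas source.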

You see atlas is about 4 times faster than the fastest ublas product.
For more details, please look at the source.
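For readers wondering what a blocked product does differently, the following is a rough sketch of cache blocking in plain C++. It is not the ublas block_prod implementation and not the code from the attached program; `blocked_add_product` and its parameters are hypothetical names, and the matrices are assumed to be square, row-major and stored in flat vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// X += A * B for n x n row-major matrices stored in flat vectors.
// The loops run over bs x bs tiles so that the tiles of A, B and X
// currently in use can stay in cache while they are reused.
void blocked_add_product(std::vector<double>& X,
                         const std::vector<double>& A,
                         const std::vector<double>& B,
                         std::size_t n, std::size_t bs) {
    assert(bs > 0);
    assert(X.size() == n * n && A.size() == n * n && B.size() == n * n);
    for (std::size_t ii = 0; ii < n; ii += bs) {
        const std::size_t i_end = std::min(ii + bs, n);
        for (std::size_t kk = 0; kk < n; kk += bs) {
            const std::size_t k_end = std::min(kk + bs, n);
            for (std::size_t jj = 0; jj < n; jj += bs) {
                const std::size_t j_end = std::min(jj + bs, n);
                // Multiply one tile of A by one tile of B and
                // accumulate into the matching tile of X.
                for (std::size_t i = ii; i < i_end; ++i) {
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j) {
                            X[i * n + j] += a * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

A call such as `blocked_add_product(X, A, B, 500, 64)` would correspond to the 500x500 case with a block size of 64, the value used in the quoted block_prod call. The best block size depends on the cache, and a tuned library such as atlas does considerably more than this.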

Best regards,
Gunter

PS: Thanks again to the one who once provided the nice macros ...
