

From: boost (boost_at_[hidden])
Date: 2001-11-20 17:19:58


On Tuesday 20 November 2001 16:33, walter_at_[hidden] wrote:

> > Are there any blocking techniques in the current ublas library ?
> One of the objectives for ublas is to reduce the abstraction penalty
> when using vector and matrix abstractions. Therefore we use Todd
> Veldhuizen's expression template technique to eliminate temporaries
> and fuse loops and the Barton-Nackman trick to avoid virtual function
> call overhead.
Thanks for the clarification.
In my problem, a major part of the computing time is spent in matrix-matrix
multiplications of various forms, with dimensions ranging from tiny up to
~1000 and a typical size of 100..400. (All these small blocks are part
of a big matrix, and the vectors are stored in a dyadic product of
2 vector spaces, i.e. represented by a list of matrices.)

> For large matrices it's advisable (as you mention) to go the other
> way: reintroduce temporaries and split loops. So I imagine, that the
> usage of blocking techniques is a possible extension of ublas.
> BTW, netlib BLAS doesn't implement blocking, but LAPACK does AFAIK ;-)
> > I'm asking since I would need a performance which is comparable
> > to BLAS/ATLAS.
> Do you want to get netlib (reference) BLAS performance? Do you want
> to get ATLAS performance without any platform specific optimization?
I haven't been in computational physics for 4 years now; I re-entered
the field only 2 weeks ago. In my old code I used vendor-supplied
BLAS (e.g. IBM ESSL, ...).

> If we tune for a certain platform, which compiler/operating system
> combination do you prefer?
That's a pretty hard question. The program will run at least on
Athlon PCs (Linux clusters), IBM RS/6000 (SP) Power2/3/4, HP PA-RISC, and
SGI Origins.
I'd be happy if I could replace (specialize) a few routines of ublas
with ATLAS or vendor-supplied BLAS routines in order to run benchmarks
(mainly _x_gemm).

Best wishes
