Date: 2001-11-21 05:58:40
--- In boost_at_y..., Peter Schmitteckert (boost) <boost_at_s...> wrote:
> On Tuesday 20 November 2001 16:33, walter_at_g... wrote:
> > > Are there any blocking techniques in the current ublas library ?
> > One of the objectives for ublas is to reduce the abstraction
> > penalty when using vector and matrix abstractions. Therefore we
> > use Todd Veldhuizen's expression template technique to eliminate
> > temporaries and fuse loops, and the Barton-Nackman trick to avoid
> > virtual call overhead.
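In a nutshell, the two techniques look like this (a strongly simplified sketch with made-up names, not the actual ublas source): the expression template builds an unevaluated sum object so that `a + b` allocates no temporary and is evaluated element by element in a single loop, and the Barton-Nackman/CRTP base dispatches to the derived class statically, so there is no virtual call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: parameterized on the derived type, so self() resolves
// at compile time -- no virtual functions anywhere.
template <class Derived>
struct vector_expression {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// A concrete dense vector participating in the expression hierarchy.
struct dense_vector : vector_expression<dense_vector> {
    std::vector<double> data;
    explicit dense_vector(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator()(std::size_t i) const { return data[i]; }
};

// Lazy sum node: holds references, computes elements on demand.
// No temporary vector is ever materialized.
template <class L, class R>
struct vector_sum : vector_expression<vector_sum<L, R> > {
    const L& l;
    const R& r;
    vector_sum(const L& l_, const R& r_) : l(l_), r(r_) {}
    double operator()(std::size_t i) const { return l(i) + r(i); }
};

// operator+ returns the expression object instead of evaluating.
template <class L, class R>
vector_sum<L, R> operator+(const vector_expression<L>& l,
                           const vector_expression<R>& r) {
    return vector_sum<L, R>(l.self(), r.self());
}
```

Assigning such an expression to a destination vector then runs one fused loop over `(a + b + c)(i)`, which is where the temporaries and the extra loops disappear.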
> Thanks for the clarification.
> In my problem a major part of the computing time is spent in matrix
> multiplications of various forms, with dimensions ranging from tiny
> to ~1000, and a typical size of 100..400.
Ok, we'll check the effect of blocked matrix multiply for these sizes.
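For reference, what I mean by a blocked multiply is the usual loop-tiling transformation (a minimal sketch; the block size `bs` is a placeholder that would have to be tuned per platform and cache size, it is not a ublas routine):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Cache-blocked matrix multiply sketch: C += A * B, all row-major n x n.
// Tiling the i/k/j loops keeps bs x bs tiles of A, B, C resident in
// cache while they are reused, instead of streaming whole rows/columns.
void blocked_gemm(const std::vector<double>& A, const std::vector<double>& B,
                  std::vector<double>& C, std::size_t n, std::size_t bs = 64) {
    for (std::size_t ii = 0; ii < n; ii += bs)
        for (std::size_t kk = 0; kk < n; kk += bs)
            for (std::size_t jj = 0; jj < n; jj += bs)
                // Multiply one tile pair, accumulating into the C tile.
                for (std::size_t i = ii; i < std::min(ii + bs, n); ++i)
                    for (std::size_t k = kk; k < std::min(kk + bs, n); ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < std::min(jj + bs, n); ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

For your sizes (100..400) the whole working set may already be close to cache-resident, so the win over a well-ordered triple loop needs measuring rather than assuming.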
> (All these small blocks are part
> of a big matrix, and the vectors are stored in a dyadic product of
> 2 vector spaces, i.e. represented by a list of matrices).
You've lost me. Could you please explain or give a reference?
> > > I'm asking since I would need a performance which is comparable
> > > to BLAS/ATLAS.
> > Do you want to get netlib (reference) BLAS performance? Do you want
> > to get ATLAS performance without any platform-specific tuning?
> > If we tune for a certain platform, which compiler/operating system
> > combination do you prefer?
> That's a pretty hard question. The program will at least run on
> Athlon PCs (Linux cluster), IBM RS/6000 (SP) Power2/3/4, HP PA, and
> SGI Origins.
We only share the Intel/Linux platform.
> I'd be happy if I could replace (specialize) a few routines of ublas
> with ATLAS or vendor-supplied BLAS routines in order to improve
> performance (mainly _x_gemm).
I think this is one of the next steps, as already discussed with Toon
Knapen. We'll also look at it.
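The shape of such a specialization could be a thin dispatch wrapper (a hedged sketch, not an agreed interface: `my_gemm` and the `USE_CBLAS` build flag are made-up names, and the `cblas.h` interface is assumed to be provided by ATLAS or the vendor BLAS):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// If a CBLAS is available at build time, forward to the tuned dgemm;
// otherwise fall back to a plain triple loop. USE_CBLAS is a
// hypothetical configuration macro for this sketch.
#ifdef USE_CBLAS
#include <cblas.h>
#endif

// C = A * B, all row-major n x n.
void my_gemm(const std::vector<double>& A, const std::vector<double>& B,
             std::vector<double>& C, std::size_t n) {
#ifdef USE_CBLAS
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                (int)n, (int)n, (int)n,
                1.0, &A[0], (int)n, &B[0], (int)n,
                0.0, &C[0], (int)n);
#else
    // Portable fallback so the code builds everywhere.
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double s = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                s += A[i * n + k] * B[k * n + j];
            C[i * n + j] = s;
        }
#endif
}
```

That would cover all the platforms you list, since each vendor ships its own BLAS, and ATLAS fills the gap on the Linux cluster.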
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk