Date: 2001-03-26 14:57:00
--- In boost_at_y..., lums_at_l... wrote:
> --- In boost_at_y..., boost_at_s... wrote:
> > the left, and B_i appears on the right.
> > So using different storage layouts, I can increase performance
> > since no strides appear.
> For dense times dense, the storage layout of the original matrices is
> somewhat irrelevant for performance. By performing
> appropriate blocking and re-arranging for optimal layout within
> working blocks, one can get optimal performance for any original
> layout. ATLAS, PhiPAC, optimized BLAS, and MTL all do this AFAIK.
> That is, yes, layout is important, but only in the working blocks.
> Any original storage format can be copied to a good orientation in
> the working blocks within the mat-mult algorithm.
O.k., I have never written blocked versions of matmuls, so I assume
Nevertheless, one should provide both memory layouts, although
one may name them differently, for representing transposed matrices.
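
To make the blocking argument concrete, here is a minimal sketch of a
blocked multiply that copies each working block into a stride-free
orientation, whatever the layout of the operands. The names Layout,
Matrix and blocked_multiply are hypothetical, made up for this
illustration; this is not how ATLAS, PhiPAC or MTL are actually
implemented, and it assumes A.cols == B.rows without checking.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only (not from any library in this thread).
enum class Layout { RowMajor, ColMajor };

struct Matrix {
    std::size_t rows, cols;
    Layout layout;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c, Layout l)
        : rows(r), cols(c), layout(l), data(r * c, 0.0) {}

    // Element access hides the layout: the same (i, j) works for both.
    double& operator()(std::size_t i, std::size_t j) {
        return layout == Layout::RowMajor ? data[i * cols + j] : data[j * rows + i];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return layout == Layout::RowMajor ? data[i * cols + j] : data[j * rows + i];
    }
};

// C = A * B with square blocks of edge bs. Assumes A.cols == B.rows.
Matrix blocked_multiply(const Matrix& A, const Matrix& B, std::size_t bs = 32) {
    Matrix C(A.rows, B.cols, Layout::RowMajor);
    std::vector<double> a_blk(bs * bs), b_blk(bs * bs);

    for (std::size_t i0 = 0; i0 < A.rows; i0 += bs) {
        const std::size_t mb = std::min(bs, A.rows - i0);
        for (std::size_t k0 = 0; k0 < A.cols; k0 += bs) {
            const std::size_t kb = std::min(bs, A.cols - k0);

            // Pack the A block row by row, so k runs with unit stride.
            for (std::size_t i = 0; i < mb; ++i)
                for (std::size_t k = 0; k < kb; ++k)
                    a_blk[i * kb + k] = A(i0 + i, k0 + k);

            for (std::size_t j0 = 0; j0 < B.cols; j0 += bs) {
                const std::size_t nb = std::min(bs, B.cols - j0);

                // Pack the B block column by column, so k again has unit stride.
                for (std::size_t j = 0; j < nb; ++j)
                    for (std::size_t k = 0; k < kb; ++k)
                        b_blk[j * kb + k] = B(k0 + k, j0 + j);

                // Inner kernel: only the packed copies are read here, so the
                // layout of the original operands no longer matters.
                for (std::size_t i = 0; i < mb; ++i)
                    for (std::size_t j = 0; j < nb; ++j) {
                        double sum = 0.0;
                        for (std::size_t k = 0; k < kb; ++k)
                            sum += a_blk[i * kb + k] * b_blk[j * kb + k];
                        C(i0 + i, j0 + j) += sum;
                    }
            }
        }
    }
    return C;
}
```

The layout only shows up in the packing loops, which is where a strided
read may happen; the kernel itself sees unit strides in both operands.
Note also that the buffer of a row-major m x n matrix is exactly the
buffer of a column-major n x m matrix, so offering both layouts is one
way to represent a transposed matrix without copying.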
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk