From: boost_at_[hidden]
Date: 2001-03-26 14:57:00
Hi,
[...]
--- In boost_at_y..., lums_at_l... wrote:
> --- In boost_at_y..., boost_at_s... wrote:
> > the left, and B_i appears on the right.
> > So using different storage layouts, I can increase performance
> > since no strides appear.
>
> For dense times dense, the storage layout of the original matrices is
> somewhat irrelevant for performance issues. By performing
> appropriate blocking and re-arranging for optimal layout within the
> working blocks, one can get optimal performance for any original
> layout. ATLAS, PHiPAC, optimized BLAS, and MTL all do this AFAIK.
>
> That is, yes, layout is important, but only in the working blocks.
> Any original storage format can be copied to a good orientation in
> the working blocks within the mat-mult algorithm.
O.k., I have never written blocked versions of matmul, so I assume
you're right.
Nevertheless, one should provide both memory layouts, although
one may name them differently, for representing transposed matrices.
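To make the two points concrete, here is a rough sketch of how I understand them. It is not taken from ATLAS, PHiPAC, or MTL, and all names in it (View, row_major, col_major, transposed, blocked_matmul) are made up for illustration. A strided view covers both layouts, a transposed matrix is the same data with the strides swapped, and the multiply copies each working block into a contiguous buffer, so the original layout only matters during the copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical strided view: element (i, j) lives at data[i * rs + j * cs].
struct View {
    const double* data;
    std::size_t rows, cols, rs, cs;
    double operator()(std::size_t i, std::size_t j) const {
        return data[i * rs + j * cs];
    }
};

// Row-major: consecutive elements of a row are adjacent in memory.
inline View row_major(const double* p, std::size_t r, std::size_t c) {
    return {p, r, c, c, 1};
}
// Column-major: consecutive elements of a column are adjacent in memory.
inline View col_major(const double* p, std::size_t r, std::size_t c) {
    return {p, r, c, 1, r};
}
// Transposing swaps extents and strides; no data is moved.
inline View transposed(const View& v) {
    return {v.data, v.cols, v.rows, v.cs, v.rs};
}

// Returns C = A * B as a row-major (A.rows x B.cols) array.
// bs is the working block size; 64 is an arbitrary default, not a tuned value.
std::vector<double> blocked_matmul(const View& A, const View& B,
                                   std::size_t bs = 64) {
    assert(A.cols == B.rows);
    const std::size_t m = A.rows, k = A.cols, n = B.cols;
    std::vector<double> C(m * n, 0.0);
    std::vector<double> a_blk(bs * bs), b_blk(bs * bs);

    for (std::size_t i0 = 0; i0 < m; i0 += bs) {
        const std::size_t mb = std::min(bs, m - i0);
        for (std::size_t p0 = 0; p0 < k; p0 += bs) {
            const std::size_t kb = std::min(bs, k - p0);
            // Copy the A block into a contiguous buffer, whatever A's layout is.
            for (std::size_t i = 0; i < mb; ++i)
                for (std::size_t p = 0; p < kb; ++p)
                    a_blk[i * bs + p] = A(i0 + i, p0 + p);

            for (std::size_t j0 = 0; j0 < n; j0 += bs) {
                const std::size_t nb = std::min(bs, n - j0);
                // Copy the B block so that the innermost loop runs with stride 1.
                for (std::size_t p = 0; p < kb; ++p)
                    for (std::size_t j = 0; j < nb; ++j)
                        b_blk[p * bs + j] = B(p0 + p, j0 + j);

                // Multiply the two packed blocks; no strided access in here.
                for (std::size_t i = 0; i < mb; ++i)
                    for (std::size_t p = 0; p < kb; ++p) {
                        const double a = a_blk[i * bs + p];
                        for (std::size_t j = 0; j < nb; ++j)
                            C[(i0 + i) * n + (j0 + j)] += a * b_blk[p * bs + j];
                    }
            }
        }
    }
    return C;
}
```

With this, blocked_matmul(col_major(a, m, k), transposed(row_major(bt, n, k))) goes through the same inner loops as the plain row-major case. A real implementation would also tune the block size to the cache and unroll the inner kernel, which I have left out here.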
Best wishes,
Peter
---------------------------
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk