From: Martin Weiser (weiser_at_[hidden])
Date: 2002-06-28 06:48:01
On Friday, 28 June 2002 00:24, Joerg Walter wrote:
> ----- Original Message -----
> From: "Martin Weiser" <weiser_at_[hidden]>
> > 3. Large matrices (N=40,..,250). The operands don't fit any longer
> > into the 1st level cache. As expected, the lower memory bandwidth of
> > the 2nd level cache decreases the performance of both uBLAS and the
> > naive implementation. Due to blocked algorithms the Sun BLAS
> > maintains its high performance and wins by a factor of about 7. The
> > great surprise is that the naive implementation suffers less than
> > uBLAS and wins by a factor of about 1.7. Currently I've no
> > explanation for that.
> We're switching from indexing to iterating access at N = 32. You could
> play with NUMERICS_ITERATOR_THRESHOLD to see the impact on your
Thanks for the hint. GCC/UltraSparc seems to prefer indexing. The
performance drop that I attributed to the L1->L2 transition seems instead
to be caused by the switch from indexing to iteration. Coincidentally, that
switch happens at the same matrix size where the L1->L2 transition must in
fact occur.
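
To make the indexing/iteration switch concrete, here is a minimal sketch of
the idea in plain C++. It is not the uBLAS implementation: the names
kIteratorThreshold, axpy_indexed, axpy_iterating and axpy are hypothetical,
and the exact comparison used at the threshold is an assumption. It only
shows the same kernel written with both access styles and a size-based
dispatch between them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the threshold discussed above (not uBLAS code).
constexpr std::size_t kIteratorThreshold = 32;

// Indexed access: y[i] += a * x[i], addressing elements by position.
void axpy_indexed(double a, const std::vector<double>& x,
                  std::vector<double>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] += a * x[i];
}

// Iterating access: the same update, walking both ranges with iterators.
void axpy_iterating(double a, const std::vector<double>& x,
                    std::vector<double>& y) {
    assert(x.size() == y.size());
    auto xi = x.begin();
    auto yi = y.begin();
    for (; xi != x.end(); ++xi, ++yi)
        *yi += a * *xi;
}

// Size-based dispatch: indexing below the threshold, iterating at or above
// it. Whether the boundary is inclusive is an assumption of this sketch.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y,
          std::size_t threshold = kIteratorThreshold) {
    if (x.size() < threshold)
        axpy_indexed(a, x, y);
    else
        axpy_iterating(a, x, y);
}
```

Calling axpy with a threshold of 1024 keeps every size below 1024 on the
indexed path. Both paths compute the same result, so any timing difference
comes from the access style and how the compiler handles it.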
For the threshold at 1024, the new graph is available at
As a surprising side effect, the change also seems to improve the
performance for small matrices (N<<32).
--
Dr. Martin Weiser            Zuse Institute Berlin
weiser_at_[hidden]           Scientific Computing
http://www.zib.de/weiser     Numerical Analysis and Modelling