|
Boost : |
From: Martin Weiser (weiser_at_[hidden])
Date: 2002-06-28 03:54:16
On Freitag, 28. Juni 2002 03:57, Matthew Hurd wrote:
> A factor of ten is a factor of ten, however...
>
> Everyone has always used these observed FLOPpy charts for comparison,
> however I think this is wrong.
> [...]
> Strassen's algorithm is n^(log7 = 2.81), but has higher overhead and is
> thus considered unsuitable for smaller n according to my text in front
> of me. Coppersmith and Winograd have an algorithm that goes at
> O(2.376).
>
> For n = 200 we are looking at 27 and 9.8 times the number of floating
> point operations for naive n^3 and Strassen over Coppersmith and
> Winograd. This ignores some possibly large constants.
Be careful with these fast matrix multiplications. I don't have a
reference at hand, but I remember Strassen's algorithm is numerically
unstable (for floating point arithmetics). Thus you trade performance for
accuracy, which is probably not what you'd expect from a linear algebra
library. I remember having heard that *any* matrix multiplication
algorithm using less than n^3 operations can be shown to be unstable.
Of course, for *exact* arithmetic (say, int, or rational), Strassen's
algorithm and its followers may be the method of choice.
> So I think benchmarking performance measuring FLOPS is noble in trying
> to get throughput clean but fraught with danger w.r.t. to true matrix
> multiplication throughput. For example, in the chart above, the uBLA
> implementation could be faster at doing the matrix calc than the Sun
> BLAs if uBLA was 27 times more efficient at FLOP rationing than the 10
> times FLOP throughput advantaged Sun BLA.
>
> Perhaps a better measure would be:
> n x n matrices per second * n ^ 3 = Matt's Notional FLOPs.
>
> Then again, perhaps you are going this already as the easiest thing to
> measure is just what I put above... If so I'm just going on about
> nothing. Sorry.
In fact, that's exactly what's shown in the graph. And I suppose this is
at least in leading order exactly the actual flops rate of both uBLAS and
Sun BLAS.
Yours,
Martin
-- Dr. Martin Weiser Zuse Institute Berlin weiser_at_[hidden] Scientific Computing http://www.zib.de/weiser Numerical Analysis and Modelling
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk