Subject: Re: [ublas] Matrix multiplication performance
From: Michael Lehn (michael.lehn_at_[hidden])
Date: 2016-02-16 20:19:38

On 16 Feb 2016, at 17:09, Michael Lehn <michael.lehn_at_[hidden]> wrote:

>>> About Blaze: Do they have their own implementation of a matrix-matrix product? It seems to require a
>>> tuned BLAS implementation (“Otherwise you get only poor performance”) for the matrix-matrix product.
>> I will check the benchmarks I run. I think I was using MKL with Blaze, but Blaze is taking it a step further (I am not sure how) and getting better performance than the underlying GEMM. Their benchmarks indicate that they are faster than MKL (see their row-major matrix-matrix multiplication benchmark).
> I started today with similar experiments on BLAZE and had closer look at their internal implementation. By default
> they are calling an external BLAS backend. On my machine I used the Intel MKL. But you are right, they also have
> an internal implementation that can be used if no external BLAS is available. I will publish the results on this page:
> At the moment the benchmarks for the internal BLAZE implementation of the matrix-matrix product look
> poor. I asked Klaus Iglberger (the author of BLAZE) to check the compiler flags that I have used. So don't take the
> current results as-is.

I modified the benchmark code in favour of BLAZE as Klaus Iglberger suggested. The results are now slightly
better. Compared to Intel MKL, however, they are still way off. I got similar results on my Haswell machine. The
problem seems to be as Klaus Iglberger stated: "The optimization flags are fine. The performance difference is
due to the lack of a specifically optimized kernel." So on other platforms it might perform much better.
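To illustrate what "a specifically optimized kernel" buys you: the gap between a textbook triple loop and a tuned BLAS comes largely from cache blocking, packing, and a hand-written SIMD micro-kernel for the innermost block. Below is a minimal sketch (my own illustration, not BLAZE's or MKL's actual code) showing only the first of those steps, cache blocking with illustrative block sizes; a real GEMM would additionally pack the A/B panels into contiguous buffers and dispatch the inner block to an architecture-specific micro-kernel:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Naive triple-loop GEMM: C += A*B, row-major, A is MxK, B is KxN, C is MxN.
void gemm_naive(std::size_t M, std::size_t N, std::size_t K,
                const double *A, const double *B, double *C)
{
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < K; ++k)
                C[i*N + j] += A[i*K + k] * B[k*N + j];
}

// Cache-blocked variant: process NC x KC x MC blocks so the working set
// stays resident in cache. The block sizes here are illustrative; tuned
// BLAS implementations choose them per architecture and hand the inner
// block to a vectorized micro-kernel, which is where most of the
// remaining speedup over this sketch comes from.
void gemm_blocked(std::size_t M, std::size_t N, std::size_t K,
                  const double *A, const double *B, double *C)
{
    const std::size_t MC = 64, KC = 64, NC = 64;  // illustrative block sizes
    for (std::size_t jc = 0; jc < N; jc += NC)
        for (std::size_t pc = 0; pc < K; pc += KC)
            for (std::size_t ic = 0; ic < M; ic += MC) {
                const std::size_t mb = std::min(MC, M - ic);
                const std::size_t nb = std::min(NC, N - jc);
                const std::size_t kb = std::min(KC, K - pc);
                // "Macro-kernel": plain loops here, a micro-kernel in real BLAS.
                for (std::size_t i = 0; i < mb; ++i)
                    for (std::size_t j = 0; j < nb; ++j) {
                        double sum = 0.0;
                        for (std::size_t k = 0; k < kb; ++k)
                            sum += A[(ic+i)*K + pc + k] * B[(pc+k)*N + jc + j];
                        C[(ic+i)*N + jc + j] += sum;
                    }
            }
}
```

Without the micro-kernel, the blocked version alone will not reach MKL either, which matches the observation above: generic C++ loops, however well compiled, cannot substitute for a kernel written for the specific CPU.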

Dr. Michael Lehn
University of Ulm, Institute for Numerical Mathematics
Helmholtzstr. 20
D-89069 Ulm, Germany
Phone: (+49) 731 50-23534, Fax: (+49) 731 50-23548