
Subject: Re: [ublas] Matrix multiplication performance
From: Michael Lehn (michael.lehn_at_[hidden])
Date: 2016-01-21 19:09:16
>> About Blaze: Do they have their own implementation of a matrix-matrix product? It seems to require a
>> tuned BLAS implementation (“Otherwise you get only poor performance”) for the matrix-matrix product.
> I will check the benchmarks I run. I think I was using MKL with Blaze, but Blaze is taking it a step further (I am not sure how) and they are getting better performance than the underlying GEMM. Their benchmarks indicate that they are faster than MKL (https://bitbucket.org/blazelib/blaze/wiki/Benchmarks#!rowmajormatrixmatrixmultiplication)
They use a log scale for the benchmarks. IMHO that does not make any sense. On this benchmark they are only
better for matrix dimensions smaller than 100. Even if you run the same implementation twice you get
fluctuations of that magnitude. At 1000 it's identical. And outside of the C++ world I have never seen log scales for
MFLOPS benchmarks. A log scale makes sense if you compare the runtime of an O(N^k) algorithm, but I don't see the
point for illustrating performance. All this started with the BTL (Benchmark Template Library).
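To see why a log scale hides the differences that matter here, consider how MFLOPS is computed for GEMM: a dense N x N matrix product performs roughly 2*N^3 floating-point operations. A minimal sketch (not from the original email; the timings below are made-up illustrative values) shows that two implementations differing by ~10% sit close together in MFLOPS, a gap a log axis compresses even further:

```python
def gemm_mflops(n, seconds):
    """MFLOPS rate for an N x N matrix-matrix product that took `seconds`.

    A GEMM of dimension N performs about 2*N^3 floating-point operations,
    so MFLOPS = 2*N^3 / (seconds * 1e6).
    """
    return 2.0 * n ** 3 / (seconds * 1e6)


# Two hypothetical implementations at N = 1000, with made-up timings:
fast = gemm_mflops(1000, 0.10)  # 20000.0 MFLOPS
slow = gemm_mflops(1000, 0.11)  # ~18182 MFLOPS, about 10% slower
```

On a linear MFLOPS axis a 10% gap of this kind is clearly visible; on a log axis it all but disappears, which is the kind of fluctuation the email says can occur even between two runs of the same implementation.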
But I will look into the BLAZE code to make sure (as in proof).