

Re: [glas] dense and sparse vectors

From: Andrew Lumsdaine (lums_at_[hidden])
Date: 2005-08-07 16:23:04


On Aug 7, 2005, at 3:32 PM, Toon Knapen wrote:
> I was mainly planning to benchmark 'sparse axpy' of different libraries
> (I'm currently using mtl-v2.1.22, oski-1.0 and ublas). I indeed
> mentioned 'dense' also but mainly because I also plan to benchmark the
> sparse axpy against the dense versions.
>
> As for having a 'dense axpy' implementation in glas itself: Glas should
> probably not try to outperform vendor-blas axpy's: the vendor-optimised
> axpy's are already there, hard to compete with although not generic.
> That's also why we stated in the requirements of glas that it should be
> able to use 3rd party backends
> (http://glas.sourceforge.net/doc/requirements.html).

That depends. For dense axpy's on a RISC microprocessor it is
fairly straightforward to compete with vendor-optimised axpy's. The
algorithm is not that complicated and the performance limitation is
not due to the magic of coding, compilation, or algorithmic
expression -- but rather to memory bandwidth. All that is
(typically) required is that the compiler support some form of
"restrict" and even the basic expression of an axpy operation can be
well optimized.
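
To make the "restrict" point concrete, here is a minimal sketch of a
basic dense axpy (y <- alpha*x + y). The names axpy_kernel and axpy are
hypothetical and are not taken from MTL or glas. The __restrict keyword
is assumed to be available as a compiler extension (GCC, Clang and MSVC
accept it); it is not standard C++. Each element costs two flops
against two loads and one store (24 bytes for doubles), which is why
the loop tends to be limited by memory bandwidth rather than by how
cleverly it is coded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Basic axpy kernel: y[i] += alpha * x[i].
// __restrict is a compiler extension (assumed available), not standard C++.
// It promises the compiler that x and y never overlap, so the loop can be
// unrolled, software-pipelined, or vectorised without aliasing checks.
inline void axpy_kernel(std::size_t n, double alpha,
                        const double* __restrict x,
                        double* __restrict y)
{
    for (std::size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

// Hypothetical convenience wrapper over std::vector.
// The caller must pass two distinct vectors of equal length.
inline void axpy(double alpha,
                 const std::vector<double>& x,
                 std::vector<double>& y)
{
    assert(x.size() == y.size());
    assert(&x != &y);  // the no-alias promise would be broken otherwise
    axpy_kernel(x.size(), alpha, x.data(), y.data());
}
```

Calling axpy(2.0, x, y) with x = {1, 2, 3} and y = {10, 20, 30} leaves
y = {12, 24, 36}. Whether this reaches vendor-BLAS speed on a given
machine still has to be measured.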

The situation is somewhat different for vector machines where memory
bandwidth is not the main performance limitation (and where C++
compilers are not as highly advanced).

I continue to be of the opinion (based on our experiences with MTL)
that, in most cases, a generic C++ library can meet the performance
of vendor-tuned libraries.

I should also mention that there are really two aspects of
performance that one wants to measure with benchmarking. One is to
measure the raw performance of a given algorithm against the
achievable maximum (whether that level is based on a vendor-tuned
library or on theoretical models). The second is to measure the
performance of a generic algorithm against its hand-coded equivalent
(i.e., the "abstraction penalty"). It's probably a good idea to have
an abstraction penalty suite to validate different compilers, coding
idioms, etc.
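
As a rough illustration of the second aspect, one entry in such an
abstraction penalty suite might look like the sketch below. All names
(axpy_hand, axpy_generic, seconds_per_call, abstraction_penalty) are
hypothetical, and the code assumes a C++11 compiler for the lambdas and
std::chrono. It times an iterator-based generic axpy against a
hand-coded pointer loop and reports the ratio; a ratio near 1.0 would
suggest the compiler removes the abstraction. A real suite would also
need warm-up runs, several problem sizes, and care that the optimiser
does not discard the work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hand-coded reference: raw pointers and an explicit index loop.
inline void axpy_hand(std::size_t n, double alpha, const double* x, double* y)
{
    for (std::size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

// Generic version: works for any iterator pair and scalar type.
template <class Scalar, class InIt, class OutIt>
void axpy_generic(Scalar alpha, InIt first, InIt last, OutIt y)
{
    for (; first != last; ++first, ++y)
        *y += alpha * *first;
}

// Calls f() reps times and returns the average wall-clock seconds per call.
template <class F>
double seconds_per_call(F f, int reps)
{
    assert(reps > 0);
    const auto t0 = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r)
        f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count() / reps;
}

// Returns time(generic) / time(hand-coded) for vectors of length n.
// The result is not meaningful if the runs are too short for the clock.
inline double abstraction_penalty(std::size_t n, int reps)
{
    std::vector<double> x(n, 1.0), y_hand(n, 0.0), y_gen(n, 0.0);
    const double alpha = 0.5;
    const double t_hand = seconds_per_call(
        [&] { axpy_hand(n, alpha, x.data(), y_hand.data()); }, reps);
    const double t_gen = seconds_per_call(
        [&] { axpy_generic(alpha, x.begin(), x.end(), y_gen.begin()); }, reps);
    assert(y_hand == y_gen);  // both versions must compute the same result
    return t_gen / t_hand;
}
```

The same harness could be pointed at a vendor axpy instead of axpy_hand
to cover the first aspect (raw performance against an achievable
maximum).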