Subject: Re: [ublas] norm_2_squared in a fast way
From: Gunter Winkler (guwi17_at_[hidden])
Date: 2009-09-06 15:20:09
Manoj Rajagopalan wrote:
> Also there are some #defines which seem to be able to use existing SIMD. One
> of these is BOOST_UBLAS_USE_SIMD. I don't know how this works exactly - I'm
> guessing this flag is set/reset at configure-time prior to installation.
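For context, here is a minimal sketch of what "explicit loop unrolling with Duff's device" can look like for a sum of squares (the norm_2 squared from the subject line). The function name norm_2_squared_duff and the unroll factor of 4 are assumptions made for illustration; this is not the code in detail/duff.hpp and not a uBLAS API.

```cpp
#include <cstddef>

// Hypothetical helper, not part of uBLAS: sum of squares of n doubles,
// with the loop body unrolled four times via Duff's device.
double norm_2_squared_duff(const double* x, std::size_t n) {
    double sum = 0.0;
    if (n == 0) return sum;               // nothing to do, and avoids passes == 0

    std::size_t passes = (n + 3) / 4;     // number of trips through the do-while
    switch (n % 4) {                      // jump into the loop body to handle the remainder
    case 0: do { sum += *x * *x; ++x;
    case 3:      sum += *x * *x; ++x;
    case 2:      sum += *x * *x; ++x;
    case 1:      sum += *x * *x; ++x;
            } while (--passes > 0);
    }
    return sum;
}
```

Calling norm_2_squared_duff(a, 5) on an array holding 1, 2, 3, 4, 5 first handles one element (5 % 4 == 1) and then one full pass of four. Whether this beats a plain for loop depends on the compiler and flags, so it is worth measuring both.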
The BOOST_UBLAS_USE_SIMD define enables the use of Duff's device (see
detail/duff.hpp) to do explicit loop unrolling. However, I do not know
whether it gives any benefit over the automatic loop-unrolling
features of modern compilers. If you really want maximum performance,
then you have to use ATLAS or a similarly optimized BLAS. (uBLAS's
performance for BLAS1 and BLAS2 is quite good; BLAS3 is not optimized at
all. BTW: uBLAS design goal was reasonable performance with a convenient