Subject: [boost] BOOST::SIMD - handling double : precision vs speed
From: Joel Falcou (joel.falcou_at_[hidden])
Date: 2009-04-02 13:22:39
SIMD algorithms for double precision seem to be rather hard to get right.
It's difficult to match the precision of the scalar reference, because
scalar algorithms take advantage of the internal 80-bit floating-point
registers. As a result, comparing our implementation against the
reference yields errors of around 3000 ulp (i.e. 10^-13 RMS instead of
10^-16).
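For readers who want to reproduce this kind of measurement, here is a minimal sketch of one common way to count the ulp distance between two doubles. It is not necessarily how the comparison above was done; the names `ordered_bits` and `ulp_distance` are made up for illustration, and the code assumes IEEE 754 binary64 doubles and non-NaN inputs. For values near 1.0, one ulp is about 2.2e-16, so 3000 ulp is roughly 6.7e-13.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: map a double onto an integer scale that is monotonic
// in the double's value, so adjacent representable doubles map to adjacent
// integers. Assumes IEEE 754 binary64.
inline std::int64_t ordered_bits(double x)
{
    std::int64_t i;
    std::memcpy(&i, &x, sizeof i);  // well-defined way to read the bit pattern
    // Negative doubles are stored sign-magnitude; mirror them below zero.
    return i < 0 ? std::numeric_limits<std::int64_t>::min() - i : i;
}

// Hypothetical helper: number of representable doubles between a and b.
inline std::uint64_t ulp_distance(double a, double b)
{
    assert(!std::isnan(a) && !std::isnan(b));  // NaN has no meaningful distance
    const std::int64_t ia = ordered_bits(a);
    const std::int64_t ib = ordered_bits(b);
    // Subtract in unsigned arithmetic to avoid signed overflow.
    return ia >= ib
        ? static_cast<std::uint64_t>(ia) - static_cast<std::uint64_t>(ib)
        : static_cast<std::uint64_t>(ib) - static_cast<std::uint64_t>(ia);
}
```

Calling `ulp_distance(simd_result, scalar_reference)` on each output element and taking the maximum gives a figure comparable to the one quoted above.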
Fixing this is difficult, and even where it is possible for some
algorithms, the average speed-up then drops to less than 10%, i.e. about
as fast as an unrolled scalar call over the SIMD vector elements.
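To make the baseline concrete, the "unrolled scalar call" amounts to something like the sketch below. `pack2d` is a hypothetical stand-in for a 2-lane double vector (a plain `std::array`, not a real SIMD register and not the library's type), and `map_scalar` and `sqrt_fallback` are illustrative names only.

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-in for a 2-element vector of double.
using pack2d = std::array<double, 2>;

// Apply a scalar function to each lane, fully unrolled for the 2 lanes.
template <class F>
inline pack2d map_scalar(F f, const pack2d& v)
{
    return pack2d{{ f(v[0]), f(v[1]) }};
}

// Example fallback: reuse the scalar routine, so each lane has exactly
// the precision of the scalar reference.
inline pack2d sqrt_fallback(const pack2d& v)
{
    return map_scalar([](double x) { return std::sqrt(x); }, v);
}
```

A fallback like this gives up the speed-up entirely but matches the scalar reference by construction, which is why it is the natural yardstick for a precision-corrected SIMD version.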
What should we enforce: precision or speed? Or is 10^-13 RMS enough?
Of course, this is mostly a problem on current SIMD extensions, in which
a vector of double holds only 2 elements. It may be different with the
upcoming AVX and Larrabee, which feature larger vectors.
Discussions welcome.
-- 
Joel Falcou - Assistant Professor
PARALL Team - LRI - Universite Paris Sud XI
Tel : (+33)1 69 15 66 35
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk