Subject: Re: [boost] Boost SIMD beta release
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2012-12-25 10:37:06
On Tue, Dec 25, 2012 at 7:10 PM, Joel Falcou <joel.falcou_at_[hidden]> wrote:
> On 25/12/2012 15:43, Peter Dimov wrote:
>> Mathias Gaunard wrote:
>>> The shifted iterator and the shifted load let you do aligned loads if
>>> you statically know the misalignment of the memory.
>> Does this have any performance advantage over just using an unaligned
>> load? I'd expect the microcode to do whatever the shifted load does, but
>> I haven't measured it.
> A shifted load is a pair of aligned loads plus bit shuffling. The
> technique dates back to Altivec. Experiments on 1D filtering
> using both show some benefit over unaligned loads on pre-Nehalem CPUs.
AFAIK, even on post-Nehalem CPUs unaligned loads (and stores) are
slower if the access crosses a cache line boundary. I don't
have numbers, though.
Will shifted_iterator use palignr from SSSE3?
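
To make the question concrete, here is a minimal sketch of a shifted load
built from two aligned loads, assuming 16-byte SSE registers and a
misalignment known at compile time. It is not Boost.SIMD code:
shifted_load16 and shifted_matches_unaligned are hypothetical names chosen
for illustration. With SSSE3 enabled the combine step is a single palignr
(_mm_alignr_epi8); with plain SSE2 it falls back to two byte shifts and an
OR; without SSE2 it degrades to a plain copy so the sketch still compiles.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>   // SSE2: _mm_load_si128, _mm_srli_si128, ...
#endif
#if defined(__SSSE3__)
#include <tmmintrin.h>   // SSSE3: _mm_alignr_epi8 (palignr)
#endif

// Hypothetical helper (not Boost.SIMD API): reads the 16 bytes starting at
// aligned_base + Misalign using only aligned loads.
// Assumes aligned_base is 16-byte aligned and 32 bytes are readable.
template <int Misalign>
inline void shifted_load16(const std::uint8_t* aligned_base, std::uint8_t* out)
{
    static_assert(Misalign > 0 && Misalign < 16, "misalignment must be 1..15");
#if defined(__SSE2__)
    const __m128i lo =
        _mm_load_si128(reinterpret_cast<const __m128i*>(aligned_base));
    const __m128i hi =
        _mm_load_si128(reinterpret_cast<const __m128i*>(aligned_base + 16));
#if defined(__SSSE3__)
    // palignr: concatenate hi:lo, shift right by Misalign bytes, keep low 16.
    const __m128i r = _mm_alignr_epi8(hi, lo, Misalign);
#else
    // SSE2 fallback: same result from two byte shifts and an OR.
    const __m128i r = _mm_or_si128(_mm_srli_si128(lo, Misalign),
                                   _mm_slli_si128(hi, 16 - Misalign));
#endif
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), r);
#else
    // No SSE2 available: plain copy, kept only so the sketch stays portable.
    std::memcpy(out, aligned_base + Misalign, 16);
#endif
}

// Usage sketch: the shifted load should match a plain unaligned copy.
inline bool shifted_matches_unaligned()
{
    alignas(16) std::uint8_t buf[32];
    for (int i = 0; i < 32; ++i)
        buf[i] = static_cast<std::uint8_t>(i * 7 + 1);

    std::uint8_t shifted[16];
    std::uint8_t plain[16];
    shifted_load16<5>(buf, shifted);
    std::memcpy(plain, buf + 5, 16);
    return std::memcmp(shifted, plain, 16) == 0;
}
```

The misalignment is a template parameter because palignr and the SSE2 byte
shifts take an immediate operand, which matches the "statically know the
misalignment" condition quoted above. Whether this beats a plain unaligned
load on a given CPU is something the sketch does not settle; it would need
to be measured.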