Subject: Re: [boost] Boost SIMD beta release
From: Tim Blechmann (tim_at_[hidden])
Date: 2012-12-20 05:33:18
>> > * does boost.simd support horizontal operations (fold/reduce/accumulate)?
> The reduction toolbox provides vector horizontal operations.
> You can then run std::accumulate with a proper functor and have
> the simd_iterator kick in and vectorize your large data reduction.
ok, but you might be able to use native simd instructions to do
horizontal operations. for sse, i'm using this for providing the
horizontal minimum for example:
--
__m128 data;                               /* [0, 1, 2, 3] */
__m128 low = _mm_movehl_ps(data, data);    /* [2, 3, 2, 3] */
__m128 low_accum = _mm_min_ps(low, data);  /* [0|2, 1|3, 2|2, 3|3] */
__m128 elem1 = _mm_shuffle_ps(low_accum, low_accum,
                              _MM_SHUFFLE(1,1,1,1)); /* [1|3, 1|3, 1|3, 1|3] */
__m128 accum = _mm_min_ss(low_accum, elem1);
return _mm_cvtss_f32(accum);
--

also, sse3 can be used to compute a horizontal sum:

--
__m128 accum1 = _mm_hadd_ps(data_, data_); /* [0+1, 2+3, 0+1, 2+3] */
__m128 elem1 = _mm_shuffle_ps(accum1, accum1,
                              _MM_SHUFFLE(1, 1, 1, 1)); /* [2+3, 2+3, 2+3, 2+3] */
__m128 result = _mm_add_ps(accum1, elem1);
return _mm_cvtss_f32(result);
--

>> > * how does boost.simd deal with stack alignment? on win32 i ended up
>> >   compiling with -mstackrealign, though it is said not to be the best
>> >   solution (didn't have a closer look)
> Good question, IIRC we tried to satisfy the ABI requirement of passing
> SIMD types in structures over to functions properly by using references
> at the upper level and then working with the register type as soon as
> possible. This is still suboptimal in some cases. Maybe Mathias can give
> you more insight on this subject.

once the compiler decides to allocate the register type to the stack,
one needs to care about stack alignment (compare [1]). i was hoping that
the compiler would try to ensure the alignment automatically, but
unfortunately it doesn't :/

cheers, tim

[1] http://stackoverflow.com/questions/2386408/qt-gcc-sse-and-stack-alignment
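
for reference, the two snippets above can be wrapped into complete functions roughly like this. this is only a sketch under a few assumptions: the names horizontal_min and horizontal_sum are made up for illustration (they are not boost.simd api), an x86 target with sse is assumed, and the sum deliberately reuses the movehl/shuffle pattern of the minimum instead of _mm_hadd_ps, so that it builds without sse3 being enabled.

```cpp
#include <xmmintrin.h>  // SSE1 intrinsics (__m128, _mm_movehl_ps, _mm_shuffle_ps, ...)

// Hypothetical helper: horizontal minimum of the four lanes of `data`.
inline float horizontal_min(__m128 data)             // [0, 1, 2, 3]
{
    __m128 high  = _mm_movehl_ps(data, data);        // [2, 3, 2, 3]
    __m128 pairs = _mm_min_ps(high, data);           // [0|2, 1|3, 2, 3]
    __m128 elem1 = _mm_shuffle_ps(pairs, pairs,
                                  _MM_SHUFFLE(1, 1, 1, 1)); // [1|3, 1|3, 1|3, 1|3]
    __m128 accum = _mm_min_ss(pairs, elem1);         // lane 0 holds the overall minimum
    return _mm_cvtss_f32(accum);                     // extract lane 0
}

// Hypothetical helper: horizontal sum of the four lanes of `data`,
// written with SSE1 only (no _mm_hadd_ps), same data movement as above.
inline float horizontal_sum(__m128 data)             // [0, 1, 2, 3]
{
    __m128 high  = _mm_movehl_ps(data, data);        // [2, 3, 2, 3]
    __m128 pairs = _mm_add_ps(high, data);           // [0+2, 1+3, 2+2, 3+3]
    __m128 elem1 = _mm_shuffle_ps(pairs, pairs,
                                  _MM_SHUFFLE(1, 1, 1, 1)); // [1+3, 1+3, 1+3, 1+3]
    __m128 accum = _mm_add_ss(pairs, elem1);         // lane 0 holds 0+2+1+3
    return _mm_cvtss_f32(accum);                     // extract lane 0
}
```

usage would look like horizontal_min(_mm_loadu_ps(ptr)) for an unaligned float array; _mm_load_ps only works when the pointer is 16-byte aligned, which is where the stack alignment question comes back in.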