Subject: Re: [boost] interest in structure of arrays container?
From: Larry Evans (cppljevans_at_[hidden])
Date: 2016-10-25 17:17:45


On 10/25/2016 12:41 PM, Larry Evans wrote:
> On 10/25/2016 12:22 PM, Larry Evans wrote:
> [snip]
>>
>> From the above, the LibFlatArray and SSE methods are the
>> fastest. I'd guess that a new "SoA block SSE" method, which
>> uses the _mm_* methods, would narrow the difference. I'll
>> try to figure out how to do that. I notice:
>>
>> #include <mmintrin.h>
>>
>> doesn't produce a compile error; however, that header
>> doesn't declare the _mm_add_ps used here:
>>
>> https://github.com/cppljevans/soa/blob/master/soa_compare.benchmark.cpp#L621
>>
>>
>>
>> Do you know of some package I could install on my ubuntu OS
>> that makes those SSE functions, such as _mm_add_ps,
>> available?
> [snip]
> Never mind. Googling for:
>
> __mm128
>
> led to:
>
>
> http://stackoverflow.com/questions/11679741/vector-of-mm128-wont-push-back
>
> and a change of #include to:
>
> #include <emmintrin.h>
>
> which solved the problem.
>
> particle_count=1,024
> frames=1,000
> minimum duration=0.0371714
>
> comparitive performance table:
>
> method rel_duration
> ________ ______________
> SSE_opt 0.330574
> SSE 0.440405
> Flat 0.904265
> SoA 0.911574
> Block 0.97398
> AoS 1
> StdArray 1.15079
> LFA undefined
>
Oops, another careless copy-and-paste error. The output should be:

--{--cut here--
particle_count=1,000,000
frames=1,000
minimum duration=3.5909

comparitive performance table:

method rel_duration
________ ______________
SSE_opt 1
SSE 1.01568
StdArray 1.44133
Flat 1.44861
Block 1.45053
SoA 1.52935
AoS 2.10294
LFA undefined

Compilation finished at Tue Oct 25 16:12:45
--}--cut here--
which shows SSE_opt as the fastest, with SSE about 1.6% slower.
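
In case it helps anyone else who hits the same missing-intrinsic
error, here is a minimal sketch of the #include fix in isolation. It
is not the code from soa_compare.benchmark.cpp; the function name
add_floats_sse is made up for illustration, and it assumes an x86
target with SSE2 enabled (the default on x86-64 with gcc and clang,
while 32-bit builds may need -msse2):

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>  // SSE2 header; it also pulls in <xmmintrin.h>,
                        // which declares __m128 and _mm_add_ps

// Hypothetical helper, not taken from the benchmark:
// computes out[i] = a[i] + b[i], four floats per SSE add.
inline void add_floats_sse(const float* a, const float* b,
                           float* out, std::size_t n)
{
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t i = 0;
    // Unaligned loads and stores, so the arrays need no special alignment.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    // Scalar loop for the leftover 0 to 3 elements.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

With only <mmintrin.h> (the MMX header) this would fail to compile,
since neither __m128 nor _mm_add_ps is declared there.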

-regards,
Larry


