Subject: Re: [boost] interest in structure of arrays container?
From: Larry Evans (cppljevans_at_[hidden])
Date: 2016-10-21 02:39:19
On 10/21/2016 01:07 AM, Michael Marcin wrote:
> On 10/21/2016 12:48 AM, Michael Marcin wrote:
>> On 10/20/2016 10:02 PM, Larry Evans wrote:
>>>
>>> The modification added soa_emitter_block_t which uses soa_block.
>>> Unfortunately, this soa_emitter_block_t takes about twice as long as
>>> your soa_emitter_static_t.
>>>
>>> I've no idea why. Any guesses?
>>>
>>
>> 2x is quite an abstraction penalty.
>> I can only assume your compiler is failing to optimize away some part of
>> the abstraction.
OOPS. Yeah, I forgot to turn on the compiler's optimization flags :(
>>
>> FWIW on vs2015 I'm not seeing nearly as much of a difference.
>>
>> particle_count=1,000,000
>> AoS in 6.34667 seconds
>> SoA in 4.26384 seconds
>> SoA flat in 4.16572 seconds
>> SoA Static in 5.4037 seconds
>> SoA block in 5.5588 seconds
>>
>
> I'm still trying to work out how to fit overaligned subarrays into your
> framework.
>
> The issue is that many SIMD instructions require more than just
> alignof(T) alignment.
>
> Subarrays of float/double/int/short/char or carefully crafted UDTs might
> need to be aligned to as much as 64 bytes in the worst case.
>
> On the MIC architecture, vector load/store operations
> must be called on 64-byte aligned memory addresses.
> On the Xeon architecture with AVX/AVX2 instruction sets
> (Sandy Bridge, Ivy Bridge or Haswell), alignment does not matter.
> In earlier architectures (Nehalem, Westmere) alignment did matter,
> but a 32-byte alignment was necessary.
>
> https://software.intel.com/en-us/forums/intel-many-integrated-core/topic/507547
>
>
> At the very least, support for the basic 16-byte SSE alignment of
> subarrays is crucial.
>
>
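To make the requirement concrete, here is a minimal sketch of the offset arithmetic an overaligned subarray implies when all subarrays share one buffer. The names align_up, layout2, and make_layout are hypothetical and are not part of soa_block. The sketch assumes the buffer itself starts on a boundary at least as strict as the largest requested alignment, and that alignments are powers of two.

```cpp
#include <cstddef>

// Round offset up to the next multiple of align.
// Assumption: align is a power of two.
constexpr std::size_t align_up(std::size_t offset, std::size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

// Hypothetical layout of two subarrays of n elements each in one buffer:
// n bools first, then n floats starting on a 16-byte boundary.
struct layout2
{
    std::size_t bool_offset;
    std::size_t float_offset;
    std::size_t total_bytes;
};

constexpr layout2 make_layout(std::size_t n)
{
    // The bool subarray starts at the beginning of the buffer.
    std::size_t bool_off = 0;
    // Pad after the bools so the float subarray is 16-byte aligned
    // relative to the start of the buffer.
    std::size_t float_off = align_up(bool_off + n * sizeof(bool), 16);
    return {bool_off, float_off, float_off + n * sizeof(float)};
}
```

The padding only helps if the buffer's base address is itself aligned to 16 bytes or more, so the allocation would need to request that alignment as well.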
> My best idea so far is some magic wrapper type that gets special
> treatment. Like:
> using data_t = soa_block< float3, soa_align<float,16>, bool >;
>
> This maybe opens the door for other magic types like:
> using data_t = soa_block< float3, soa_align<float,16>, soa_bit >;
>
That seems reasonable to me.
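
One way the wrapper could get its "special treatment" is a small trait that maps each column specifier to an element type and a subarray alignment. This is only a sketch under assumptions: soa_column is a hypothetical name, the soa_align shown here is a stand-in for the proposed wrapper, and nothing below is taken from the existing soa_block code.

```cpp
#include <cstddef>
#include <type_traits>

// Stand-in for the proposed wrapper: "a subarray of T whose first
// element is aligned to Align bytes". It is only a tag; it holds no data.
template <typename T, std::size_t Align>
struct soa_align {};

// Hypothetical trait. The primary template handles plain types:
// the element type is T and the subarray uses the natural alignment.
template <typename T>
struct soa_column
{
    using type = T;
    static constexpr std::size_t alignment = alignof(T);
};

// The specialization unwraps soa_align<T, Align> and records the
// requested alignment instead of alignof(T).
template <typename T, std::size_t Align>
struct soa_column<soa_align<T, Align>>
{
    static_assert((Align & (Align - 1)) == 0, "Align must be a power of two");
    static_assert(Align >= alignof(T), "Align must not be weaker than alignof(T)");
    using type = T;
    static constexpr std::size_t alignment = Align;
};

// Example element type, assumed to look like this for illustration.
struct float3 { float x, y, z; };

static_assert(std::is_same<soa_column<soa_align<float, 16>>::type, float>::value,
              "wrapper is unwrapped to its element type");
static_assert(soa_column<soa_align<float, 16>>::alignment == 16,
              "requested alignment is recorded");
static_assert(soa_column<float3>::alignment == alignof(float3),
              "plain types keep their natural alignment");
```

A container could then compute each subarray's starting offset from soa_column<X>::alignment, and a further magic type such as soa_bit would presumably be handled by another specialization of the same trait.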