Boost logo

Boost :

Subject: Re: [boost] interest in structure of arrays container?
From: Michael Marcin (mike.marcin_at_[hidden])
Date: 2016-10-16 14:45:48


On 10/16/2016 1:59 AM, degski wrote:
>
> You state that the example is a toy example. To me the example shows that
> iterating over a vector of smaller objects (pods) is faster than iterating
> over a vector of larger objects, duh.

Iterating over a vector of smaller objects being faster than iterating
over a vector of larger objects is the core of the SoA approach.

> The real use case might be more
> interesting, maybe you can describe it.
>

A common real use case is a particle system in a game.

A very simple particle could look like:
A real particle will often have many more fields.

struct particle_t {
     float3 position;
     float3 velocity;
     float3 acceleration;
     float2 size;
     float4 color;
     float energy;
     bool alive;
};

Every Tick you need to update all of the particles.

     void update() {
         size_t n = position.size();
         for ( size_t i = 0; i < n; ++i ) {
             auto & v = velocity[i];
             v = v + (acceleration[i] * dt);
             v.y += gravity * dt;

             position[i] = position[i] + (v * dt);
             energy[i] -= dt;
             if ( energy[i] <= 0 ) {
                 alive[i] = false;
             }
         }
     }

Note that update() doesn't read or write several fields, these fields
are only used for rendering. You can get a decent win purely through
improved cache utilization by keeping this cold data out of the way.

You can then think about going one step further and building a simd
vectorized update loop, which is strictly impossible with an Array of
Structures approach. There are two obvious ways to go about vectorizing
this update.
  1. You can store position/velocity/acceleration as float4 instead of
float3 and operate on xyz for a single particle in one instruction.
  2. You can store each component in its own array and operate on 4
particles at once.

This is still a toy example but it's closer to something real.

http://codepad.org/r9u82dOh

AoS in 6.54421 seconds
SoA in 5.91915 seconds
SoA SSE in 3.58603 seconds
Press any key to continue . . .

Note: there is still room for improvement in the SSE example


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk