Boost logo

Boost :

Subject: Re: [boost] interest in structure of arrays container?
From: Michael Marcin (mike.marcin_at_[hidden])
Date: 2016-10-15 04:14:00


On 10/14/2016 11:32 PM, degski wrote:
> On 15 October 2016 at 05:43, Michael Marcin <mike.marcin_at_[hidden]> wrote:
>
>> Does this already exist somewhere?
>>
>
> Take a look at Boost.DoubleEnded 1 <http://erenon.hu/double_ended/> and see
> whether boost::double_ended::batch_deque
> <http://erenon.hu/double_ended/double_ended/design_rationale.html#double_ended.design_rationale.batch_deque>
> fills all of your requirements.
>

Sorry I think I wasn't clear.

Let me back up a bit and give some examples.

Arrays of Structures is the normal programming model.

Let's take a toy example, say I'm looking at some census data and have a
structure that looks like:

struct citizen_t {
     string first_name;
     string last_name;
     int salary;
     int age;
};

Then the Array of Structures container would be:
vector<citizen_t> aos_citizens;

The Structure of Arrays version of the same data would look like:

struct soa_citizens_t {
     vector<string> first_names;
     vector<string> last_names;
     vector<int> salary;
     vector<int> age;
};

soa_citizens_t soa_citizens;

Why does this matter?

Let's say I want to calculate the average salary of 300 million
citizens. The code is a simple iterative average and very simple.

// Array of Structures
int avg = 0;
int t = 1;
for ( auto & c : aos_citizens ) {
     avg += (c.salary - avg) / t;
     ++t;
}

// Structures of Arrays
int avg = 0;
int t = 1;
for ( int salary : soa_citizens.salary ) {
     avg += (salary - avg) / t;
     ++t;
}

Run this under a simple benchmark:
https://ideone.com/QNqKpD

AoS 3.03523 seconds
SoA 1.94902 seconds

Both loops are doing the exact same work but the memory layout allows
the second loop to run much faster.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk