Subject: Re: [boost] interest in structure of arrays container?
From: Larry Evans (cppljevans_at_[hidden])
Date: 2016-10-31 10:14:08


On 10/31/2016 03:09 AM, Michael Marcin wrote:
> On 10/30/2016 7:45 AM, Larry Evans wrote:
>>
>> Would you post that somewhere? I'd be curious about how it
>> differs.
>>
>
> My code isn't very complete but since everyone else is sharing I'll post
> what I've got.
>
> FWIW I had to go back pretty far and then still make changes to get a
> version of your soa_compare.benchmark.cpp that compiled on windows VS2015.
>
> particle_count=1,000,000
> minimum duration=2.85542
>
> comparative performance table:
>
> method     rel_duration
> ________   ______________
> Aos        3.08266
> SoA        1.51109
> Flat       1.50358
> StdArray   1.91317
> Block      1.60447
> SSE        1.35994
> SSE_opt    1.18056
> SSE_goon   1
> Press any key to continue . . .
>
>
> code:
> http://codepad.org/DECRpJrO
>
> test:
> http://codepad.org/IbdVcdq8
>
Thanks Michael. I found it interesting.

However, I was still getting the 'double free' error message, so
I ran valgrind. It showed a problem in the alive update loop.
When the code was changed to:

         uint64_t *block_ptr = alive.data();
         auto e_ptr = energy.data();
         for ( size_t i = 0; i < n; ) {
           #define REVISED_CODE
           #ifdef REVISED_CODE
             auto e_i = e_ptr + i;
           #endif
             uint64_t block = 0;
             do {
               #ifndef REVISED_CODE
                 //this code causes valgrind to show errors.
                 auto e_i = e_ptr + i;
               #endif
                 _mm_store_ps( e_i, _mm_sub_ps( _mm_load_ps( e_i ), t ));
                 block |=
                   uint64_t
                   ( _mm_movemask_ps( _mm_cmple_ps( _mm_load_ps( e_i ), zero )))
                   << (i % bits_per_uint64_t)
                   ;
                 i += 4;
             } while ( i % bits_per_uint64_t != 0 );
             *block_ptr++ = block;
         }

valgrind reported no errors; however, when !defined(REVISED_CODE),
valgrind reported:

valgrind --tool=memcheck /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937== Memcheck, a memory error detector
==7937== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==7937== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==7937== Command: /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937==
COMPILE_OPTIM=0
particle_count=1,000
frames=1,000
{run_test=SSEopt_vec
==7937== Invalid read of size 16
==7937== at 0x403D6B: emitter_t<(method_enum)6>::update() (soa_compare.benchmark.cpp:962)
==7937== by 0x403094: run_result_t run_test<emitter_t<(method_enum)6> >(unsigned long, unsigned long) (soa_compare.benchmark

Line 962 of my soa_compare.benchmark.cpp is:

                 _mm_store_ps( e_i, _mm_sub_ps( _mm_load_ps( e_i ), t ));
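
One possible reading of the report, offered as a sketch and not
confirmed anywhere in this thread: the inner do/while only stops when
i reaches a multiple of bits_per_uint64_t, so with particle_count=1,000
the index keeps advancing past n. The "Invalid read of size 16" matches
one _mm_load_ps (4 floats). The code below reproduces only the index
arithmetic, with no SSE, and assumes bits_per_uint64_t is 64 and that
energy holds exactly particle_count floats. The names floats_touched and
padded_count are hypothetical helpers, not part of the benchmark.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: bits_per_uint64_t in the benchmark equals 64.
const std::size_t bits_per_block = 64;
// One _mm_load_ps / _mm_store_ps handles 4 floats (16 bytes).
const std::size_t floats_per_sse = 4;

// Mirrors the index arithmetic of the loop when e_i is recomputed
// inside the do/while: returns one past the last float index touched.
inline std::size_t floats_touched(std::size_t n) {
    std::size_t i = 0;
    while (i < n) {
        do {
            i += floats_per_sse;              // e_ptr + i advances each step
        } while (i % bits_per_block != 0);    // stops only on a block boundary
    }
    return i;
}

// Hypothetical remedy: round the element count up to a whole block so
// the storage covers every index the loop can reach.
inline std::size_t padded_count(std::size_t n) {
    return (n + bits_per_block - 1) / bits_per_block * bits_per_block;
}

// For the valgrind run above (particle_count=1,000), the loop would
// touch indices up to 1023, i.e. 24 floats past the end.
inline void check_particle_count_1000() {
    assert(floats_touched(1000) == 1024);
    assert(padded_count(1000) == 1024);
}
```

If this reading is right, the revised code avoids the invalid read only
because e_i stays fixed at the start of each 64-float block, which also
means it no longer visits the other floats in that block. Allocating
padded_count(n) floats, or bounding the inner loop by n as well, would
be alternative ways to keep the original indexing in range.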

-regards,
Larry

