Subject: Re: [boost] interest in structure of arrays container?
From: Larry Evans (cppljevans_at_[hidden])
Date: 2016-10-31 10:14:08
On 10/31/2016 03:09 AM, Michael Marcin wrote:
> On 10/30/2016 7:45 AM, Larry Evans wrote:
>>
>> Would you post that somewhere? I'd be curious about how it
>> differs.
>>
>
> My code isn't very complete but since everyone else is sharing I'll post
> what I've got.
>
> FWIW I had to go back pretty far and then still make changes to get a
> version of your soa_compare.benchmark.cpp that compiled on windows VS2015.
>
> particle_count=1,000,000
> minimum duration=2.85542
>
> comparative performance table:
>
> method     rel_duration
> ________   ______________
> Aos        3.08266
> SoA        1.51109
> Flat       1.50358
> StdArray   1.91317
> Block      1.60447
> SSE        1.35994
> SSE_opt    1.18056
> SSE_goon   1
> Press any key to continue . . .
>
>
> code:
> http://codepad.org/DECRpJrO
>
> test:
> http://codepad.org/IbdVcdq8
>
Thanks Michael. I found it interesting.
However, I was still getting the 'double free' error message, so
I tried valgrind. It showed a problem in the alive update loop.
When the code was changed to:
  uint64_t *block_ptr = alive.data();
  auto e_ptr = energy.data();
  for ( size_t i = 0; i < n; ) {
#define REVISED_CODE
#ifdef REVISED_CODE
    auto e_i = e_ptr + i;
#endif
    uint64_t block = 0;
    do {
#ifndef REVISED_CODE
      //this code causes valgrind to show errors.
      auto e_i = e_ptr + i;
#endif
      _mm_store_ps( e_i, _mm_sub_ps( _mm_load_ps( e_i ), t ));
      block |=
        uint64_t
        ( _mm_movemask_ps( _mm_cmple_ps( _mm_load_ps( e_i ),
                                         zero )))
        << (i % bits_per_uint64_t)
        ;
      i += 4;
    } while ( i % bits_per_uint64_t != 0 );
    *block_ptr++ = block;
  }
valgrind reported no errors; however, when !defined(REVISED_CODE),
valgrind reported:
valgrind --tool=memcheck /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937== Memcheck, a memory error detector
==7937== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==7937== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==7937== Command: /tmp/build/clangxx3_8_pkg/clang/struct_of_arrays/work/soa_compare.benchmark.optim0.exe
==7937==
COMPILE_OPTIM=0
particle_count=1,000
frames=1,000
{run_test=SSEopt_vec
==7937== Invalid read of size 16
==7937==    at 0x403D6B: emitter_t<(method_enum)6>::update() (soa_compare.benchmark.cpp:962)
==7937==    by 0x403094: run_result_t run_test<emitter_t<(method_enum)6> >(unsigned long, unsigned long) (soa_compare.benchmark
My soa_compare.benchmark.cpp:962 line is:
_mm_store_ps( e_i, _mm_sub_ps( _mm_load_ps( e_i ), t ));
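One possible reading of that report, offered as a sketch rather than a confirmed diagnosis: the inner do/while only stops when i reaches a multiple of bits_per_uint64_t, so with particle_count=1,000 the version that recomputes e_i inside the loop walks up to float index 1,024. With e_i hoisted out of the inner loop, the pointer never moves past the start of each block, which keeps every read in bounds but also means only the first 4 floats of each group of 64 are ever updated. The scalar model below reproduces just the index arithmetic, with no SSE. The helpers end_index_touched and padded_count are hypothetical names, and the model assumes that energy holds exactly n floats and that bits_per_uint64_t is 64; neither is shown in the post.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values; the post does not show these definitions.
constexpr std::size_t bits_per_uint64_t = 64;
constexpr std::size_t lanes = 4;  // floats read by one _mm_load_ps

// Hypothetical helper: one past the highest float index the loop reads
// for n particles. hoisted == true models REVISED_CODE, where e_i is
// computed once per 64-bit block instead of once per inner iteration.
constexpr std::size_t end_index_touched(std::size_t n, bool hoisted)
{
    std::size_t end = 0;
    for (std::size_t i = 0; i < n; ) {
        const std::size_t block_start = i;
        do {
            const std::size_t e_i = hoisted ? block_start : i;
            if (e_i + lanes > end) end = e_i + lanes;
            i += lanes;
        } while (i % bits_per_uint64_t != 0);
    }
    return end;
}

// Hypothetical helper: element count rounded up to a whole 64-bit block,
// i.e. the size the energy array would need for the unhoisted loop.
constexpr std::size_t padded_count(std::size_t n)
{
    return (n + bits_per_uint64_t - 1) / bits_per_uint64_t * bits_per_uint64_t;
}

// Unhoisted loop reads floats [0, 1024) when n is 1000.
static_assert(end_index_touched(1000, false) == 1024, "overrun past n");
// Hoisted loop stays in bounds but only touches the start of each block.
static_assert(end_index_touched(1000, true) == 964, "stays below n");
static_assert(padded_count(1000) == 1024, "rounded up to 64");
```

If this reading is right, allocating padded_count(n) floats for energy (or bounding the inner loop by n as well) would be one way to keep the per-iteration e_i while avoiding the out-of-range read.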
-regards,
Larry