Subject: Re: [boost] [Endian] Performance
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2011-09-06 07:47:07
Arash Partow wrote:
> Phil Endecott wrote:
>> I've tested this on two systems:
>> (A): Marvell ARMv5TE ("Feroceon") @ 1.2 GHz, g++ 4.4
>> (B): Freescale ARMv7 (i.MX53, Cortex-A8) @ 1.0 GHz, g++ 4.6.1
>> Compiled with: --std=c++0x -O4
> What do the numbers look like if you compile with the -march=native switch
> set? Or, if you're cross-compiling, replace 'native' with the correct
> gcc-supported arch.
"error: bad value (native) for -march= switch"
These are both native compilers that are already tuned to the systems
that they run on.
> To further that, is there much of a difference when compiled with pgo?
> -fprofile-generate, run, -fprofile-use. btw I've found that in some situations
> O2 performs better than O3 or O4 - though pgo cleans up a lot of those
> inconsistencies at O3+ levels.
I'm sure that some tweaking could adjust the performance, but mostly by
changing the benchmark infrastructure, e.g. loop unrolling, interleaving,
etc. The important thing here is what the actual byteswap gets
compiled to, and there are three discrete choices: the REV instruction,
word loads and stores with bit-twiddling, and byte loads and stores.
The results I've posted are sufficient to show how those perform and
which source code compiles to which implementation.
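
For readers who have not seen the earlier messages, here is a rough
sketch of the kind of source that tends to lead to each of the three
choices. The function names are hypothetical and are not taken from the
benchmark or from the proposed library. Which instructions a given
compiler actually emits depends on its version, flags and target (REV
only exists from ARMv6 onwards, so system (A) cannot use it), so the
comments describe likely outcomes rather than guarantees.

```cpp
#include <cstdint>
#include <cstring>

// (1) Compiler intrinsic (GCC/Clang extension). On ARMv6 and later this
//     can be compiled to a single REV instruction; on older cores the
//     compiler has to fall back to something else.
inline std::uint32_t swap_builtin(std::uint32_t x) {
    return __builtin_bswap32(x);
}

// (2) Word-sized value with bit-twiddling: shifts, masks and ORs on a
//     register that was loaded and will be stored as a whole word.
inline std::uint32_t swap_shifts(std::uint32_t x) {
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

// (3) Byte-at-a-time: copy the four bytes out, write them back in
//     reverse order. Written naively, this tends to become byte loads
//     and byte stores.
inline std::uint32_t swap_bytes(std::uint32_t x) {
    unsigned char in[4];
    unsigned char out[4];
    std::memcpy(in, &x, sizeof in);
    out[0] = in[3];
    out[1] = in[2];
    out[2] = in[1];
    out[3] = in[0];
    std::memcpy(&x, out, sizeof out);
    return x;
}
```

All three return the same value for the same input; the only way to
tell which implementation you actually got is to look at the generated
assembly (e.g. g++ -S) for your compiler and target.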