Subject: Re: [boost] [convert] Performance
From: Joel de Guzman (djowel_at_[hidden])
Date: 2014-06-10 19:03:24


On 6/11/14, 4:55 AM, Andrey Semashev wrote:
> On Wednesday 11 June 2014 06:46:53 Vladimir Batov wrote:
>>
>> And indeed BOOST_ASSERT seems to be heavier than BOOST_TEST due to
>> expression-validity check done with
>>
>> __builtin_expect(expr, 1)
>
> It's not a validity check, it's a hint to the compiler to help branch
> prediction. Assertion failures are assumed to be improbable.
>
> In any case, when testing performance you should be building in release mode,
> where all asserts are removed.
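
To illustrate the point about __builtin_expect being a branch hint rather than a validity check, here is a minimal sketch of how an assertion macro could be built on it. The names SKETCH_LIKELY, SKETCH_ASSERT, sketch_assert_failed and checked_divide are hypothetical and are not Boost's actual definitions; the sketch assumes GCC or Clang for the builtin and falls back to the plain expression elsewhere.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical names for illustration only; not Boost's actual definitions.
#if defined(__GNUC__) || defined(__clang__)
// Tell the compiler the condition is expected to be true.
#  define SKETCH_LIKELY(x) __builtin_expect(!!(x), 1)
#else
#  define SKETCH_LIKELY(x) (x)
#endif

// Reports the failed expression and aborts.
inline void sketch_assert_failed(char const* expr, char const* file, int line)
{
    std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    std::abort();
}

#ifdef NDEBUG
// Release mode: the assertion compiles to nothing.
#  define SKETCH_ASSERT(expr) ((void)0)
#else
// Debug mode: the check stays, with the failure path marked as unlikely.
#  define SKETCH_ASSERT(expr) \
     (SKETCH_LIKELY(expr) ? (void)0 \
                          : sketch_assert_failed(#expr, __FILE__, __LINE__))
#endif

inline int checked_divide(int a, int b)
{
    SKETCH_ASSERT(b != 0);
    return a / b;
}
```

The hint only influences how the compiler lays out the two branches; the condition itself is still evaluated unless NDEBUG is defined, which is why timing a debug build measures the asserts as well.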

Benchmarks are a black art. See how we do our performance tests in Spirit:

   https://github.com/boostorg/spirit/blob/master/workbench/qi/int_parser.cpp

You can use our benchmark facility where all the black art is contained:

   https://github.com/boostorg/spirit/blob/master/workbench/measure.hpp

using this strategy:

         // Strategy: because the sum in an accumulator after each call
         // depends on the previous value of the sum, the CPU's pipeline
         // might be stalled while waiting for the previous addition to
         // complete. Therefore, we allocate an array of accumulators,
         // and update them in sequence, so that there's no dependency
         // between adjacent addition operations.
         //
         // Additionally, if there were only one accumulator, the
         // compiler or CPU might decide to update the value in a
         // register rather than writing it back to memory. We want each
         // operation to at least update the L1 cache. *** Note: This
         // concern is specific to the particular application at which
         // we're targeting the test. ***

         // This has to be at least as large as the number of
         // simultaneous accumulations that can be executing in the
         // compiler pipeline. A safe number here is larger than the
         // machine's maximum pipeline depth. If you want to test the L2
         // or L3 cache, or main memory, you can increase the size of
         // this array. 1024 is an upper limit on the pipeline depth of
         // current vector machines.
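
The strategy described in that comment can be sketched roughly as follows. This is not the code from measure.hpp: the namespace sketch and the names number_of_accumulators, accumulators, time_per_call and total are hypothetical, and using volatile to keep the compiler from holding the sums in registers is an assumption made for this illustration.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical sketch; the names below are not taken from measure.hpp.
namespace sketch
{
    // Number of independent accumulators, following the 1024 figure
    // mentioned in the comment above.
    constexpr std::size_t number_of_accumulators = 1024;

    // volatile (an assumption of this sketch) makes the compiler emit a
    // load and a store for every update instead of keeping the sum in a
    // register.
    volatile long accumulators[number_of_accumulators] = {};

    // Calls f(i) `repeats` times (repeats must be greater than zero) and
    // adds each result to the next accumulator in round-robin order, so
    // that adjacent additions do not depend on each other. Returns the
    // average wall-clock time per call in seconds.
    template <typename F>
    double time_per_call(F f, std::size_t repeats)
    {
        auto const start = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i != repeats; ++i)
        {
            std::size_t const slot = i % number_of_accumulators;
            accumulators[slot] = accumulators[slot] + f(i);
        }
        auto const stop = std::chrono::steady_clock::now();
        std::chrono::duration<double> const elapsed = stop - start;
        return elapsed.count() / static_cast<double>(repeats);
    }

    // Sums all accumulators; printing or otherwise using the result keeps
    // the measured work observable.
    long total()
    {
        long sum = 0;
        for (std::size_t i = 0; i != number_of_accumulators; ++i)
            sum += accumulators[i];
        return sum;
    }
}
```

A caller would pass the operation under test as f, for example a lambda that parses one integer and returns it, then read total() afterwards so the results are actually used. A real harness would also subtract loop overhead and repeat the measurement, which this sketch leaves out.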

A naive test implementation will give you *funny* results, depending
on the machine you are running on.

HTH.

Regards,

-- 
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/
