Subject: Re: [boost] [xpressive] Performance Tuning?
From: Edward Grace (ej.grace_at_[hidden])
Date: 2009-07-28 09:35:50
>>
>> Can you post the warnings? Today's warnings are tomorrow's errors!
>> I will attack them and hopefully iron them out - I too like clean
>> build logs.
>
> Sure, let me separate yours out from that rather bloody massive
> warning that the xpressive code generates:
> 1>r:\programming_projects\spirit_price\price_parsing\other_includes\ejg\timer.cpp(468)
> : warning C4267: 'initializing' : conversion from 'size_t' to 'unsigned int', possible loss of data
> 1> r:\programming_projects\spirit_price\price_parsing\other_includes\ejg\timer.cpp(558)
> : see reference to function template instantiation
> 'ejg::timer_result_type &ejg::generic_timer<ticks>::measure_execution_result<_Operation>(_Operation,ejg::timer_result_type
[... more reams of stuff... ]
Someone on this list mentioned something about intractable template
warnings....
...I can't imagine what the fuss is about.
It looks like a subtle interaction with boost::bind and the expected
form of the functor
f()
which is marvellously non-specific about any return types. I will
take a look later...
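As an aside, C4267 on its own is just a narrowing complaint: it typically appears on builds where size_t is wider than unsigned int. Below is a minimal sketch of the pattern and the usual ways round it. The function names and bodies are hypothetical; this is not the actual code at timer.cpp(468).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from timer.cpp.
// Keeping the count as std::size_t avoids the implicit narrowing
// that MSVC reports as C4267.
inline std::size_t count_samples(const std::vector<double>& samples) {
    // unsigned int n = samples.size();  // implicit narrowing: C4267
    std::size_t n = samples.size();      // same type as the source value
    return n;
}

// If an interface really does require unsigned int, make the
// conversion explicit so that the intent is visible in the code.
inline unsigned int count_samples_narrow(const std::vector<double>& samples) {
    return static_cast<unsigned int>(samples.size());
}
```

The first form is usually preferable since nothing is lost; the cast only documents that the truncation is deliberate.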
>> You should only need 1 call. The timer code should work out how many
>> iterations it needs in order to obtain a satisfactory answer. In
>> fact, going crazy, you should be able to reliably measure the
>> speedup of parsing a single character! ;-)
>
> Which is how I understood it, which is why I turned it down to 1
> iteration. :)
Good show.
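To illustrate the idea of letting the timer pick its own iteration count, here is a rough sketch. It is not the algorithm the ejg timer uses; it simply doubles the number of calls until the total run is long enough to measure. The name seconds_per_call, the threshold parameter and the use of std::chrono (rather than getticks from cycle.h) are all assumptions made for the example.

```cpp
#include <chrono>
#include <cstddef>

// Illustrative only: keep doubling the iteration count until one batch
// takes at least min_total_seconds, then report the mean time per call.
template <typename Operation>
double seconds_per_call(Operation op, double min_total_seconds = 0.01) {
    using clock = std::chrono::steady_clock;
    std::size_t iterations = 1;
    for (;;) {
        const clock::time_point start = clock::now();
        for (std::size_t i = 0; i < iterations; ++i) {
            op();  // the caller only ever supplies a nullary callable
        }
        const std::chrono::duration<double> elapsed = clock::now() - start;
        if (elapsed.count() >= min_total_seconds) {
            return elapsed.count() / static_cast<double>(iterations);
        }
        iterations *= 2;  // batch too short to trust: try twice as many
    }
}
```

The caller passes the function once, e.g. seconds_per_call(parse_once), and never has to guess a loop count.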
>>> Calibrating overhead......done
>>> Timer overhead (t_c) ~= : 14.6667
>>> Jitter ~= : 0.633371
>>
>> If you're employing getticks from FFTW's cycle.h perhaps it's not
>> returning actual clock ticks, (e.g. from a Pentium cycle counter).
>>
>> On my machine (Intel Core 2 - OS X) the timer overhead is ~109
>> ticks == 109 clock cycles.
>
> I am using the cycle.h file, and I am pretty sure it was...
Ok.
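For anyone wanting to sanity-check an overhead figure independently, a sketch along the following lines may help. It is an assumption-laden stand-in, not the ejg calibration code: it uses std::chrono::steady_clock instead of getticks, so it reports nanoseconds rather than clock cycles, and clock_overhead_ns is a made-up name.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Illustrative only: estimate the cost of reading the clock by timing
// back-to-back reads and returning the median difference in nanoseconds.
inline double clock_overhead_ns(std::size_t samples = 1001) {
    using clock = std::chrono::steady_clock;
    if (samples == 0) {
        samples = 1;  // need at least one measurement
    }
    std::vector<double> deltas;
    deltas.reserve(samples);
    for (std::size_t i = 0; i < samples; ++i) {
        const clock::time_point t0 = clock::now();
        const clock::time_point t1 = clock::now();
        deltas.push_back(
            std::chrono::duration<double, std::nano>(t1 - t0).count());
    }
    // The median is less sensitive to the odd interrupted sample than the mean.
    std::vector<double>::iterator mid =
        deltas.begin() + static_cast<std::ptrdiff_t>(deltas.size() / 2);
    std::nth_element(deltas.begin(), mid, deltas.end());
    return *mid;
}
```

If the clock in use has coarse resolution the result may well be zero, which is itself a hint that it is not counting individual cycles.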
>> Do you think the overhead of calling through boost::bind could be
>> comparable to the length of time it takes to run the function?
>
> No, I just tested, it is negligible
Good! One less thing to worry about.
> On Tue, Jul 28, 2009 at 6:32 AM, Edward
> Grace<ej.grace_at_[hidden]> wrote:
>> I suggest something that simply iterates over the test data but
>> does not check for correctness of parsing. Although it won't make a
>> fat lot of difference in this case at least it's then consistent -
>> you're timing the parsers not the tests for equality. The
>> correctness test could then be done later once the timings are
>> complete.
>
> I actually already did that, however I kept getting warnings about the
> measure_percentage_speedup function not being able to do something
> with the template arguments for the function calls, which were just
> standard void(void) functions, confuzzling. May look at that later,
> almost bed time.
"Confuzzling" - I like that. Perhaps boost::bind gets up to some
mischief. Let me know how you get on. I'm still trying to figure
out why the tests I ran yielded a ~10x speedup for Spirit.
Perhaps you could try a more canonical test - running
"ejg_uint_parser_0_0_4.cpp"
That does not make use of boost::bind - but tries to avoid the
optimizer getting rid of void(void) functions by using globals.
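The general shape of that trick, combined with deferring the correctness check until after the timing, might look like the sketch below. Every name in it is illustrative and std::strtoul merely stands in for the parser under test; it is not a copy of what ejg_uint_parser_0_0_4.cpp contains.

```cpp
#include <cstdlib>

// Globals give the void(void) function an observable side effect, so the
// optimizer cannot simply discard the call. Names are hypothetical.
const char* g_input = "12345";
volatile unsigned long g_result = 0;

// The function to be timed: it parses and stores, nothing else.
void parse_once() {
    g_result = std::strtoul(g_input, nullptr, 10);
}

// The correctness check, run separately once the timings are complete,
// so that the comparison is not part of what gets measured.
bool result_is_correct() {
    return g_result == 12345ul;
}
```

A plain function like parse_once can be handed to the timer directly, with no boost::bind involved.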
>> Does the size of the test data set matter? In other words do you
>> notice similar speedups if the test data will all fit in cache?
>
> His input data is a very detailed test that tests just about every
> possible input, which can have different speeds for different ones, so
> I think it would be a good overall test to keep and parse all 147k
> values, perhaps if there was some way to test them all individually
> using ejg and get a nice report? ;-)
Perhaps (one day) it could be informative to test subsets of the
data. For example you may find Spirit is unusually slow at parsing
certain patterns e.g. "2222" is 20% slower than "1111" - this could
(speculatively) point towards some deep and subtle changes that could
be made.
From what I've seen so far however there's plenty of work to be done
on Xpressive in closing the existing performance gap. ;-)
Cheers,
-ed