Subject: Re: [boost] [xpressive] Performance Tuning?
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2009-07-28 07:07:35


On Tue, Jul 28, 2009 at 4:15 AM, OvermindDL1<overminddl1_at_[hidden]> wrote:
> On Tue, Jul 28, 2009 at 3:11 AM, Edward Grace<ej.grace_at_[hidden]> wrote:
>>
>> On 28 Jul 2009, at 08:16, OvermindDL1 wrote:
>>
>>> On Mon, Jul 27, 2009 at 10:34 PM, OvermindDL1<overminddl1_at_[hidden]>
>>> wrote:
>>>>
>>>> On Mon, Jul 27, 2009 at 6:17 PM, OvermindDL1<overminddl1_at_[hidden]>
>>>> wrote:
>>>>>
>>>>> /* snip */
>>>>
>>>> I did a quick first test at work, just a quick compile, got some
>>>> errors, and quite frankly I do not know how this compiled in gcc
>>>> either.  First error is:
>>>> 1>r:\programming_projects\spirit_price\price_parsing\main.cpp(545) : error C2373: '_input' : redefinition; different type modifiers
>>>> 1>        r:\programming_projects\spirit_price\price_parsing\main.cpp(544) : see declaration of '_input'
>>>>
>>>> The relevant code is:
>>>>  template <class T>
>>>>  T
>>>>  extract(char const * & _input, char const * _description,
>>>>     std::string const & _input);
>>>>
>>>> Why do the first and last function params have the same name (_input)?
>>>>  And which one is the real input?  Based upon line 566, I changed the
>>>> last _input to _value and that error (and one other) is now gone.
>>>> Hmm, actually the third error is gone too.  Now I am getting lots of
>>>> Warnings (as errors since I by default have warnings treated as
>>>> errors) about double to int64_t cast, both in  your normal code on
>>>> line 730
>>>>
>>>> Also, I added a:
>>>>  tests.reserve(450000);
>>>> right before the load_tests call, that changed the load_tests time
>>>> from like 10 seconds to about 2 seconds on my system.
>>>>
>>>> Also, why are you using time(0), that only has second accuracy?
>>>>
>>>
>>> The mailing list seems to be taking a very long while to send the
>>> message, so here it is again, this time with only the main.cpp file
>>> attached, not the test_inputs.dat file (which, when zipped, is over
>>> 350kb).  So get the test_inputs.dat from the link in the post prior to
>>> mine, and use the main.cpp that is attached to this post.  Here is the
>>> message I sent as well, perhaps it will come through eventually:
>>>
>>> Okay, I basically just copy/pasted my thread-safe version of my spirit
>>> parser over and ran it, it returned bad parse with like 13/9 or
>>> something like that.  According to the documentation in the original
>>> cpp file, only "1", "1 2/3", or "1.2" are valid, not "2/3", so I
>>> changed it to support that and ran it again, it parsed successfully
>>> with all numbers in your tests matching successfully.  Here is what it
>>> printed, using the horribly inaccurate time function:
>>> Testing string-based parsing
>>> Testing Xpressive-based parsing
>>> Testing Spirit-based parsing
>>> string parsing: 8s
>>> xpressive parsing: 33s
>>> spirit parsing: 6s
>>>
>>> If you do not mind, I am going to add a millisecond accuracy testing
>>> framework (test.hpp from the boost examples) to the file and change
>>> all the nasty time calls to it for a more reliable reading.
>>
>> OvermindDL1 - does my timer functionality not work?  Can you try using that
>> instead?  If it's no good please let me know - and I can improve it.
>>
>> The whole point of the timer code is to obtain reliable confidence bounds
>> for precisely this kind of optimisation application in an efficient manner.
>>
>> It hurts seeing absolute times used to compare things without any idea of
>> their precision or accuracy!
>
> I have not seen how to use yours yet though, not actually looked at
> the code.  To be honest, I am just lazy and using what I know involves
> one search-and-replace, and two lines of code changed.  >.>
>
> I will make another set with your ejg timer now since I have time for
> once to play a bit.  :)
>
> For now, I made one using the high_resolution_timer.hpp that comes
> with boost for examples and benchmarks and such.  Attached is a zip of
> main.cpp and the high_resolution_timer.hpp (not the test data, it is
> too big to attach quickly).  Here are the results on my computer:
> Testing string-based parsing
> Testing Xpressive-based parsing
> Testing Spirit-based parsing
> string parsing: 7.2406s
> xpressive parsing: 29.2227s
> spirit parsing: 5.07125s
>
> Yea, a lot more accurate, but still not good for direct comparison
> with other people's results the way the ejg timer is.  I shall make a
> modification with that next.  :)
>

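On the tests.reserve(450000) point quoted above, here is a minimal
sketch of why reserving up front can help. It is not taken from
main.cpp: count_reallocations is a hypothetical helper, and I am
assuming the test inputs are held in a std::vector of std::string,
which the thread does not confirm. It only counts capacity changes
while appending; how much time that saves depends on the element type
and the library implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts how many times the vector's capacity
// changes while n items are appended, with or without reserve().
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<std::string> tests;  // assumed container type
    if (reserve_first) {
        tests.reserve(n);  // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = tests.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        tests.push_back("1 2/3");  // sample input in one of the documented forms
        if (tests.capacity() != last_capacity) {
            ++reallocations;  // the buffer grew; existing elements were moved
            last_capacity = tests.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the count is zero by construction, since the
capacity never has to grow during the loop. Without it, every growth
step moves all the elements stored so far.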
Ew! Warnings from the ejg files. My build log is even more polluted! ;-)

Made a version of it using the ejg timer now. I hope I did it well
enough; it is mostly a hack-in, since the pre-existing system did not
fit it well, but it compiles and runs. Do note that I bumped the
default iterations down from 100 to 1 so it actually finishes today.
The result is:
Calibrating overhead......done
Timer overhead (t_c) ~= : 14.6667
Jitter ~= : 0.633371

string vs Xpressive : 296.093 320.578 334.169% faster.
Spirit vs Xpressive : 451.527 456.264 464.239% faster.
Spirit vs string : 25.1096 28.8328 30.6562% faster.

As you can see, string is vastly faster than Xpressive, Spirit beats
Xpressive by an even wider margin than string does, and Spirit is
slightly faster than string (if you consider 28% to be marginal ;-) ).
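
For reading the "% faster" figures, here is a small sketch of one
common definition. The thread does not show how the ejg timer computes
its percentages, so treat the formula as an assumption:
(slow / fast - 1) * 100. The name percent_faster is made up for this
example.

```cpp
#include <stdexcept>

// Assumed definition: how much faster the quicker run is, as a
// percentage of its own time. 100% faster means half the time.
double percent_faster(double slow_seconds, double fast_seconds) {
    if (slow_seconds <= 0.0 || fast_seconds <= 0.0) {
        throw std::invalid_argument("times must be positive");
    }
    return (slow_seconds / fast_seconds - 1.0) * 100.0;
}
```

For example, percent_faster(29.2227, 7.2406) on the single-run string
and Xpressive times quoted earlier comes to roughly 304%. That is in
the same region as the ejg figures but not identical, which is to be
expected since they come from different runs.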

For those of you unfamiliar with the ejg timer: it first calibrates
itself (the overhead and jitter figures above), then runs multiple
tests and iterations internally to ensure a *very* high degree of
accuracy. The three numbers are, in order, "min med max". Even
looking at the min, Spirit is still over 25% faster than string (and
that number comes with an extremely high degree of statistical
confidence).

I use Boost.Bind to call the test function, which adds a bit of
overhead. That could be noticeable on the faster parsers like Spirit,
so Spirit could potentially be even faster than the above test
indicates. Getting rid of that overhead would mean rewriting the
tests, and I wondered whether the compiler had already optimized the
bind away. I checked the disassembly and it had not, so the tests
could be rewritten better. Perhaps I will do that later.
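
One way such a rewrite might look, as a rough sketch only: take the
test function as a template parameter, so the compiler sees the
concrete callable type and has a better chance of inlining it. This is
not the ejg timer and does none of its calibration or confidence
bounds. TimingSummary and time_repeated are hypothetical names, and
std::chrono needs a C++11 compiler, which is an assumption beyond what
the attached code uses.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type: smallest, middle and largest sample, in seconds.
struct TimingSummary {
    double min_s;
    double median_s;
    double max_s;
};

// Calls f() `runs` times and summarizes the wall-clock time per call.
// F is a template parameter, so no type-erased wrapper sits between
// the timer and the code under test.
template <typename F>
TimingSummary time_repeated(F f, std::size_t runs) {
    if (runs == 0) {
        runs = 1;  // always take at least one sample
    }
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        f();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    // For an even number of samples this picks the upper middle one.
    return TimingSummary{samples.front(), samples[samples.size() / 2],
                         samples.back()};
}
```

Passing a lambda or a plain function object to time_repeated, for
instance one that runs a parser over the loaded inputs, gives a "min
med max" triple in the same spirit as the output above, though without
any of the statistical guarantees.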

Attached are all the files necessary in a zip, minus the
test_inputs.dat file of course.



