Subject: Re: [boost] [xpressive] Performance Tuning?
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2009-07-18 20:44:05


On Sat, Jul 18, 2009 at 9:15 AM, Eric Niebler<eric_at_[hidden]> wrote:
> OvermindDL1 wrote:
>>
>> To be honest, I had to change the core::to_number lines (commented
>> out) to boost::lexical_cast (right below the commented version), so
>> the xpressive version could be slightly faster if I actually had the
>> implementation of core::to_number available, and core::to_number was
>> well made.
>
> This could very well be the source of a major slow-down. Doesn't
> lexical_cast use string streams to do the conversion? It seems to me that
> you're comparing apples to oranges.

Yes. As I have complained several times in this thread, people are not
posting complete code snippets. What am I supposed to make of a call
to a function that does not exist?
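
To make the apples-to-oranges point concrete, here is a rough sketch of
the two kinds of conversion being contrasted. Neither function is the
original code: to_number_stream only approximates what a stream-based
conversion does (assuming lexical_cast works that way, as Eric
suggests), and to_number_direct is a hypothetical stand-in for the
missing core::to_number, whose real implementation I do not have.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Stream-based conversion: roughly the kind of work a string-stream
// conversion performs (an assumption about lexical_cast, not verified).
bool to_number_stream(const std::string& text, std::int64_t& out) {
    std::istringstream in(text);
    in >> out;
    // Succeed only if extraction worked and no characters are left over.
    return !in.fail() &&
           in.peek() == std::istringstream::traits_type::eof();
}

// Hypothetical stand-in for the missing core::to_number: a plain digit
// loop. Accepts an optional leading '-' followed by decimal digits only.
// No overflow checking, to keep the sketch short.
bool to_number_direct(const std::string& text, std::int64_t& out) {
    std::size_t i = 0;
    bool negative = false;
    if (i < text.size() && text[i] == '-') {
        negative = true;
        ++i;
    }
    if (i == text.size()) {
        return false;  // empty input, or a lone '-'
    }
    std::int64_t value = 0;
    for (; i < text.size(); ++i) {
        if (text[i] < '0' || text[i] > '9') {
            return false;  // reject anything that is not a digit
        }
        value = value * 10 + (text[i] - '0');
    }
    out = negative ? -value : value;
    return true;
}
```

Both return the same value for well-formed input, but the first
constructs a stream object on every call, so swapping one for the other
inside a 10000000-iteration loop can easily dominate the timing.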

On Sat, Jul 18, 2009 at 1:51 PM, John Bytheway<jbytheway+boost_at_[hidden]> wrote:
> OvermindDL1 wrote:
> <snip>
>> As stated, I have heard that Visual Studio handles template
>> stuff like Spirit better then GCC, so I am very curious how GCC's
>> timings on this file would be.
>
> Alas, gcc doesn't do so well.  I had to make a few tweaks to your code
> (you typedefed int64_t at global scope which clashes with the one in the
> C library headers, and you used an INT64 macro which doesn't exist here)
> but then I got a very long error ending with this:
>
> .../boost-trunk/boost/proto/transform/call.hpp:146: internal compiler
> error: Segmentation fault
>
> I guess the metaprogramming is too much for it :(.
>
> That was with -O3 -DNDEBUG -march=native and gcc version:
> gcc (Gentoo 4.3.3-r2 p1.2, pie-10.1.5) 4.3.3

I do not get that. GCC usually handles more templates than MSVC ever
has, just usually not as well optimized, so I do not understand how
you could be getting an internal compiler error.
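
As for the two tweaks John had to make, a more portable spelling might
look like the sketch below. It assumes the global typedef was only
there to get a 64-bit integer and that the INT64 macro was used to
write 64-bit literals; the names price_t and kOneMillion are made up
for illustration and are not from the benchmark file.

```cpp
#include <cstdint>  // std::int64_t and the INT64_C literal macro

// Use the standard fixed-width type through a differently named alias,
// instead of declaring a global typedef called int64_t that can clash
// with the C library headers. (price_t is a hypothetical name.)
typedef std::int64_t price_t;

// INT64_C appends whatever suffix the platform needs for a 64-bit
// literal, so no compiler-specific macro is required.
const price_t kOneMillion = INT64_C(1000000);
```

On a toolchain without <cstdint>, boost/cstdint.hpp offers
boost::int64_t for the same purpose.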

On Sat, Jul 18, 2009 at 1:51 PM, John Bytheway<jbytheway+boost_at_[hidden]> wrote:
> So then I tried icc 10.1 (essentially same options) which takes over a
> minute to compile this, but does succeed.  With that I got:
>
> $ ./price-icc
> Loop count:  10000000
> Parsing:  42.5
> xpressive:  27.4704
> spirit-quick(static):  1.58132
> spirit-quick_new(threadsafe):  1.52971
> spirit-grammar(threadsafe/reusable):  1.64666
>
> which are much the same as your results (except ~1.7 times faster all
> round), but the Parsing result is obviously meaningless and the
> xpressive also dubious because of lexical_cast.
>
> I then tried with icc's inter-procedural optimisations turned on too,
> which improves the xpressive code significantly, but doesn't obviously
> affect spirit:
>
> $ ./price-icc-ipo
> Loop count:  10000000
> Parsing:  42.5
> xpressive:  17.3577
> spirit-quick(static):  1.52487
> spirit-quick_new(threadsafe):  1.51834
> spirit-grammar(threadsafe/reusable):  1.65164
>
> Finally I used static linking, and the xpressive time improved again,
> and maybe the others a little.  This surprised me.
>
> $ ./price-icc-ipo-static
> Loop count:  10000000
> Parsing:  42.5
> xpressive:  12.6157
> spirit-quick(static):  1.49887
> spirit-quick_new(threadsafe):  1.48146
> spirit-grammar(threadsafe/reusable):  1.62731

Regardless, all of these numbers and times are vastly higher than what
the previous person posted, so very nice. We just need the compilable
original code to see how it compares now.

Hmm, I might try to replace all the lexical_casts with a Spirit parser
for just that number. For a single extraction like that, Spirit
compiles to *very* little assembly, which is quite impressive actually.

