Subject: Re: [boost] [xpressive] Performance Tuning?
From: Stewart, Robert (Robert.Stewart_at_[hidden])
Date: 2009-07-02 08:17:10


Eric Niebler wrote:
> Dave Jenkins wrote:
> > Robert Stewart wrote:
> >> Whoa! The performance just shot up to a mere 6X the custom
> >> code (from 175X). That might well be fast enough to keep
> >> the Xpressive version because of its readability!
> >
> > Can you create thread-local match_results objects and reuse
> > them? If so, I think you'll see parsing dwindle to almost
> > nothing and your semantic actions will account for the bulk
> > of the time spent.

At Dave's suggestion, I tried that.

> Dave, thanks for spotting the obvious perf problem I missed. I can
> confirm that reusing the match results object largely eliminates the
> remaining performance problem. I tried 3 different scenarios:
>
> 1) The original code
> 2) Static const regexes
> 3) Static const regexes with reused match results objects
>
> I ran each config for 1000000 iterations and got roughly
> these numbers:
>
> 1) ~950 sec
> 2) ~45 sec
> 3) ~9 sec
>
> So reusing the match results object (3) results in a 5x speedup
> over just using static const regexes (2). That almost
> completely erases any performance advantage of the hand-crafted
> parsing code.

Unfortunately, my results don't bear that out once the overhead of locating the smatch via thread-local storage is added. I see no difference between default-constructing an smatch each time and reusing an instance from TLS.
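For anyone following along, the TLS variant under discussion looks roughly like the sketch below. It is not the code I measured. It uses std::regex, std::smatch, and C++11 thread_local as stand-ins for Xpressive's sregex/smatch and for whatever TLS mechanism is actually available, so that it compiles without Boost. The names tls_match and parse_value, and the pattern itself, are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: hands back this thread's reusable match object.
// The lookup happens on every call, which is the overhead in question.
inline std::smatch& tls_match()
{
    thread_local std::smatch what;  // one instance per thread, built on first use
    return what;
}

// Hypothetical parser: pulls the numeric value out of "key=value".
inline bool parse_value(const std::string& line, std::string& value)
{
    // Compiled once and shared; the pattern is illustrative only.
    static const std::regex rex("(\\w+)=(\\d+)");

    std::smatch& what = tls_match();  // reuse instead of default-constructing
    if (!std::regex_match(line, what, rex))
        return false;

    // Copy the capture out: the match object only refers into `line`.
    value = what[2].str();
    return true;
}
```

Swapping the tls_match() call for a local std::smatch gives the default-constructed variant for comparison. Whether reuse wins presumably depends on how cheap the TLS lookup is relative to the allocations the match object would otherwise repeat.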

> I'll also point out this section of the docs:
> http://www.boost.org/doc/libs/1_39_0/doc/html/xpressive/user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks
>
> Both of the above optimizations (reuse regexes and
> match_results objects) are recommended there.

I skimmed some of those tips on my initial read through the documentation. I chose static regexes because I understood them to be faster. I did not reuse the match_results<> objects because they aren't thread safe. Of course, it hadn't occurred to me to use TLS, as Dave suggested, to reconcile the competing forces of reuse and the lack of thread safety.

Upon rereading, I see that you note that reusing a static regex improves performance, but I'd forgotten it by the time I needed the information. May I suggest a "Performance Tuning" section that discusses such things apart from common pitfalls, etc., and stands out better in the TOC? I didn't expect to find such information in a "Tips and Tricks" section, and I had forgotten that I had seen it by the time it was wanted.

_____
Rob Stewart robert.stewart_at_[hidden]
Software Engineer, Core Software using std::disclaimer;
Susquehanna International Group, LLP http://www.sig.com


