Boost Users :

Subject: Re: [Boost-users] pathological regex?
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2009-08-27 20:08:36


2009/8/27 Michał Nowotka <mmmnow_at_[hidden]>:
> Ok,
> first of all this isn't an answer to my question.
> Secondly - do you have any performance test to compare these libraries?

I cannot say whether that is pathological, as I do not know regex
that well, but you did mention performance issues, and Spirit2.1 is
faster than regex in just about every possible way. Spirit2.1 creates
a parse tree at compile time that is fully optimized for what you are
wanting to parse, and thus far it has been faster in every test,
from optimized old price parsers to even being faster than the
built-in atoi and similar kin. I do not know the regex syntax
perfectly, but based on
"(https?://)?([^/@]*[\\.@])?google.com[/\\?&#]*", a similar Spirit2.1
parser might be:

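   -("http" >> -lit('s') >> "://") // equal to the (https?://)? part above
>> -(*(~char_("/@") - (char_(".@") >> "google.com")) >> char_(".@"))
   // roughly equal to the ([^/@]*[\\.@])? part above: ~char_ negates the set,
   // and the difference stops the greedy star before the final separator
>> "google.com"
>> *char_("/?&#") // equal to the [/\\?&#]* part above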

That would create the Spirit2.1 parse tree at compile time, and if you
wished it could parse out anything using its proper type, faster than
anything else that has yet been tested. But yes, Spirit2.1 does
contain a few benchmarks in the docs, and the Spirit mailing
lists contain a rather large hoard of other tests that
people have done themselves.

The latest Spirit2.1 docs are at:
http://svn.boost.org/svn/boost/trunk/libs/spirit/doc/html/index.html
The docs are constantly being updated; they are the only part of
Spirit2.1 not yet complete, but are being worked on all the time. I
need to help with that... >.>

