|
Boost : |
From: Eric Niebler (eric_at_[hidden])
Date: 2005-01-12 12:46:14
John Maddock wrote:
>> And, of course, there are additional speed reserves in KMP for search
>> problems.
>
>
> Actually, I think I may remove that code as a dis-optimisation, a simple
> scan for first character is probably faster in the *average* case, and
> of course if you can do a Boyer-search that's even better.
>
> Like you, more research is needed!
>
I have experimented a bit in this area, and you're right. I've had best
results with a three-pronged approach to searches:
(1) Boyer-Moore is a *huge* win if you can do it. xpressive and GRETA
both use BM to great effect when the pattern begins with a string literal.
(2) As a fall-back, use the first character if you can. This handles
patterns like (a...|b.|c..), where I will scan forward for one of {abc}.
(3) Fall back on brute force search at every position.
-- Eric Niebler Boost Consulting www.boost-consulting.com
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk