Boost logo

Boost Users :

From: Eric Niebler (eric_at_[hidden])
Date: 2006-04-10 22:15:14


Lynn Allan wrote:
> Eric N. wrote during 2005
> > do to make Boyer-Moore work with the regex traits interface make
> > Boyer-Moore more trouble than its worth for case-insensitive matches. I
> > haven't yet run the other tests.
>
> Just curious ... does xpressive use "standard" Boyer-Moore or a variant?
> In my limited experience, the Horspool variant of Boyer-Moore is simpler
> and tends to be faster. It may be more or less prone to have problems
> with "pathological" cases than "standard" Boyer-Moore, however.

Xpressive uses a home-grown variant of Horspool. It's an extension to
the algorithm that allows it to perform case-insensitive matches. It
also works with large character sets, such as Unicode, which standard
Horspool does not, IIUC. It yields potential matches, not exact matches.
Only if it finds a tentative match is a full regex match attepted.

Since I wrote the above, I have fixed the performance problem with BMH
and case-insensitive matches by extended the regex traits class with a
function that returns all the case-folded equivalents of a character.
This resulted in a significant performance improvement for
case-insensitive matches.

HTH,

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net