Boost logo

Boost :

Subject: Re: [boost] [Review:Algorithms] Comments about search algorithms
From: dpxguard-boost_at_[hidden]
Date: 2011-09-28 16:45:44


> > And a question for Marshall: I have noticed that inside the search.hpp file you have a requirement for random-access iterators regarding the Knuth-Pratt-Morris searching algorithm as well. Is this for both the pattern and the corpus? > Yes, and the here's the reason for that. > There are basically two steps to the KMP algorithm (and frankly, the other two as well), and those two steps are coded in the do_search method. > Given a potential match point (i.e position in the pattern and the corpus): > 1)    test and advance until you find a mismatch or come to the end (and report success) > 2)    Find the next potential match point. > Step #1 can be done with ForwardIterators (or Bidirectional, in the case of Boyer-Moore), but in the second step, you have to be able to advance to arbitrary location in both the pattern and the corpus: >           //  Figure out where to start searching again > 544:          match_start += idx - skip_ [ idx ]; > 545:          idx = skip_ [ idx ] >= 0 ? skip_ [ idx ] : 0; > In this case, idx is the position in the pattern. > Hm.. if the skip table stored iterators instead of indexes, you could just use ForwardIterators..... Using iterators instead of indexes is not a bad idea, but even with indexes, what is the reason for needing random-access iterators for the corpus? Typically in good-suffix shift we are shifting the pattern and not the corpus! I will try to study your code asap in order to become more specific. I ensure you that requiring no more than forward iterators for both the pattern and the corpus is both feasible and important. And it is particularly important, since in the random-access iterators case BM will almost always be faster, hence the usefulness of a KMP implementation, which requires random-access iterators, is rather limited. Best regards Jim Xochellis Homepage: https://sites.google.com/site/xochellis/


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk