|
Boost : |
Subject: Re: [boost] [Review:Algorithms] Comments about search algorithms
From: dpxguard-boost_at_[hidden]
Date: 2011-09-29 12:26:10
Marshall Clow wrote:
> Unless I'm sadly mistaken (which has been known to happen), we are shifting both in the pattern and the corpus.
> In that code, "match_start" is the place where the match might be, and "idx" is where we want to start comparing.
> If idx == 0, then we start comparing from the beginning of the pattern, but otherwise, from somewhere in the middle.
I recommend you to study carefully this info: http://www-igm.univ-mlv.fr/~lecroq/string/node8.html#SECTION0080
And it will also help to press the "visualization" button and experiment with the applet.
Furthermore I have included bellow a typical C/C++ KMP implementation, which is a slightly modified version of the example presented in the above site. You will clearly see in this sample that the corpus is never shifted.
Best regards
Jim Xochellis
Homepage: https://sites.google.com/site/xochellis/
//*************************************
#include <cstdio>
#include <cstring>
#define XSIZE 256
void KMP(const char *pPattern, int patternSize , const char *pCorpus, int corpusSize);
//------------------------------------------
int main(int argc, char* argv[])
{
const char *pCorpus= "Knu of Knuth!";
const char *pPattern= "Knuth";
KMP(pPattern, strlen(pPattern), pCorpus, strlen(pCorpus));
getchar();
return 0;
}
//------------------------------------------
void preKmp(const char *pPattern, int patternSize , int kmpNext[]) {
int i, j;
i = 0;
j = kmpNext[0] = -1;
while (i < patternSize )
{
while (j > -1 && pPattern[i] != pPattern[j])
j = kmpNext[j];
i++; j++;
if (pPattern[i] == pPattern[j])
kmpNext[i] = kmpNext[j];
else
kmpNext[i] = j;
}
}
void KMP(const char *pPattern, int patternSize , const char *pCorpus, int corpusSize)
{
int patternIdx, corpusIdx, kmpNext[XSIZE];
/* Preprocessing */
preKmp(pPattern, patternSize , kmpNext);
/* Searching */
patternIdx = corpusIdx = 0;
while (corpusIdx < corpusSize)
{
while (patternIdx > -1 && pPattern[patternIdx] != pCorpus[corpusIdx])
patternIdx = kmpNext[patternIdx]; //<--- Shifting the pattern on mismatch
printf("patternIdx: %d -> corpusIdx: %d\n", patternIdx, corpusIdx);
patternIdx++;
corpusIdx++; //<--- corpus is always increased by 1
if (patternIdx >= patternSize )
{
printf("*** Success at: %u\n", corpusIdx - patternIdx);
return; //sucess
}
}
puts("*** failure");
}
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk