Boost logo

Boost :

Subject: Re: [boost] [gsoc 2013] Approximate string matching
From: Jan Strnad (hanny.strnad_at_[hidden])
Date: 2013-04-13 07:45:18


Dne 13.4.2013 10:41, Bjorn Reese napsal(a):
> On 04/12/2013 11:52 AM, Jan Strnad wrote:
>
>> In the matter of NFA implementation, I would probably choose dynamic
>> programming approach, but I'm thinking of shift-or algorithm as well.
>
> You may also want to look into Thomson NFAs:
>
> http://swtch.com/~rsc/regexp/regexp1.html
>
>
When I was writing about NFA I actually meant Thomson like NFA -- no
backtracking required.

I still believe that dynamic programming approach is the best: its
complexity is low (O(m*n), m is length of pattern, n is length of a
string) and so are memory requirements (e.g. for Hamming distance k it
is O(k*m)).

There is another area where NFA can be used: finding (approximate)
repetitions in a string. However, I'm not sure how useful/desired that
functionality is.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk