Boost logo

Boost :

Subject: Re: [boost] [gsoc 2013] Approximate string matching
From: Michael Marcin (mike.marcin_at_[hidden])
Date: 2013-04-12 13:50:43

On 4/12/13 4:52 AM, Jan Strnad wrote:
> Hi, I'm Jan Strnad, currently a Master's student at CTU in Prague, Czech
> Republic. This summer, I would like to participate in GSOC and I believe
> that approximate string matching is a good match for me, since my
> background i this area is quite strong.
> Now more specifically. I would like to implement approximate string
> matching algorithm(s) based on the NFA. I would like to support various
> approximate distances, such as Hamming distance [1], Levenstein distance
> [2], Damerau-Levenstein distance [3], Delta distance, Gamma distance and
> finally (delta, gamma) distance. Sorry no explanation of the last three
> found on the Web, but I can provide one if interested.
> In the matter of NFA implementation, I would probably choose dynamic
> programming approach, but I'm thinking of shift-or algorithm as well.
> Can I ask you for any kind of thoughts or feedback regarding this topic?
> Best regards,
> Jan Strnad
> [1]
> [2]
> [3]

Sounds very useful!

I guess this would fit in the string algorithm library?

What would the interface look like?

Is it as simple as:

namespace boost {
namespace algorithm {

template< typename Range1T, typename Range2T >
std::size_t levenstein_distance( Range1T first, Range2T second );


Boost list run by bdawes at, gregod at, cpdaniel at, john at