Boost logo

Boost :

Subject: Re: [boost] [algorithm] Review Request: edit_distance
From: Erik Erlandson (eje_at_[hidden])
Date: 2014-02-20 21:41:33


>
> > - Have you looked at the supposedly better algorithm "An O(NP)
> > Sequence Comparison Algorithm" by Sun Wu, Udi Manber, and Gene
> > Myers ?
>

I've been doing more cross-testing and benchmarking against the current O(ND) version. The latest O(NP) prototype is passing randomized cross-testing and is now about 25% faster than the O(ND) baseline.

The latest code now looks like this:

https://github.com/erikerlandson/algorithm/blob/order_np_alg/include/boost/algorithm/sequence/detail/edit_distance.hpp#L342

To actually get it performing faster than O(ND), I had to reorganize the looping over the diagonals so it proceeds outside-in, and derive adaptive looping bounds as a function of P and current-best distance. In the paper, Myers et al claim more like a 50% improvement, but I'm not sure if their experiments were comparing the two uni-directional variations. It should use closer to 50% less memory, but I haven't done measurements.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk