Boost logo

Boost :

From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2007-05-12 08:30:30


JD wrote:
> I would like to know if there is any interest in an implementation of
> the edit distance algorithm

I have an implementation of the "diff" (i.e. edit script) algorithm here:

http://svn.anyterm.org/anyterm/trunk/common/diff.hh
http://svn.anyterm.org/anyterm/trunk/common/diff.cc

The "edit distance" is in there somewhere.

I would welcome this sort of thing in Boost. But it needs to be the
best-known algorithm. IIRC, there was one improvement that I found in
the literature but didn't implement because I didn't fully understand it.

Please correct me if I'm wrong, but I get the impression that your code
is O(NM) where N and M are the sizes of the two input strings. The
algorithm that I implemented is O((N+M)D) where D is the edit distance
in the worst case, and typically O(N+M+D^2). The algorithm is
described in the .cc file linked above. Also, your code seems to have
O(NM) space requirement; I'm not sure what the lower bound is, but I
think it's probably O(N+M).

Regards,

Phil.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk