|
Boost : |
From: Craig Henderson (cdm.henderson_at_[hidden])
Date: 2002-10-22 02:44:49
"Bohdan" <warever_at_[hidden]> wrote in message
news:ap160b$gid$1_at_main.gmane.org...
> Thanks for links.
> I don't think that using matrix for string comparison is a good idea.
> Memory for matrix should be allocated dynamically, this doesn't allow
> to implement fast algorithm. Besides all it has O(n*n) speed and doesn't
fit
> well
> for two sequence difference ( parameters: 2 pairs of forward iterators ).
Fear not ! My implementation uses linear memory, not a full matrix like
that described in the article. If you're interested, take a look at the
longest_common_subsequence_length() algorithm in the sand-box at
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/boost-sandbox/boost-sandbox/b
oost/sequence_algo/longest_common_subsequence.hpp
The fundamental algo is the same, and just the core of the loop changes for
Levenshtein Distance.
> I think that lavenstain difference should be in boost, but it is not
enough.
There are always alternatives to every algorithm, often suitable for
different datasets. Sure, we should not just have a single implementation to
try to cover all uses.
> More interesting is some kind algorithm with some limitations and better
> perfomance. I'm not ready for more detailed conversation yet. i'll read
theory
> about various edit-differences when i have more time and i hope to propose
> alternative ways.
> But maybe your implementaion alredy doesn't use dynamic memory and is
O(n) ?
> Than i'm wasting time :)
It's never a waste of time to discuss implementation details of algorithms,
IMO :-)
Regards
Craig
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk