|
Boost : |
From: Bohdan (warever_at_[hidden])
Date: 2002-10-20 14:18:17
"Craig Henderson" <cdm.henderson_at_[hidden]> wrote in message
news:aopspu$98a$2_at_main.gmane.org...
>
> "Bohdan" <warever_at_[hidden]> wrote in message
> news:aopcum$r95$1_at_main.gmane.org...
> > "Pavol Droba" <droba_at_[hidden]> wrote in message
> > news:20021018152452.B8358_at_lenin.felcer.sk...
>
> > I have string/sequence distance algorithm, i.e. min numbrer of
> > inserts+replaces+deletes
> > that should be applied to first string string/sequence to obtain second
> > string/sequence.
> > Such function/algorithm is useful when comparing two strings with errors.
> > Also i'm interested in ltrim, rtrim, trim and different encoding
> conversions.
>
> I also have an Edit (Levenshtein) Distance implementation which is based on
> my LCS Length algorithm that is already in the sandbox in sequence_algo. I
> adapted LLCS to an Edit Distance implementation after discussion with
> Philippe Lalande, and could add it to the sandbox to begin discussion if
> you're interested.
I was looking for a theory on such kind of algorithm some
time ago, but first ~20 links from google gave me only definition of "string
distance",
no algorithms :(. So i could not prove that my implementation really gives min
count of operations to transform string. The next hole is that i do not know
if there is more effective algorithm than my ... no doubts it exists :).
We used string difference in UDF (user defined library) on SQL server side,
namely for comparing person address considering possible errors.
Speed requirements for such function were really high.
I'd like to look at your/Levenstain/Lalande implementaion and
i will appreciate if you give me few links.
regards,
bohdan.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk