
Boost Users :

From: Paul Davis (pjdavis_at_[hidden])
Date: 2006-08-29 10:52:47


Just a quick note: if you're interested in efficient string searching,
there's some interesting stuff in one of the Topcoder Intel Multi-Threaded
Marathon matches. The competition was called String Search. You've got to
be a member to view the problem statement and see the various solutions, but
it's all free.

Anyway, the guy who took first place made interesting use of DSP math in his
algorithm. Might be worth taking a look. Other than that, I'd recommend
adapting one of the exact string matching algorithms to count mismatches.
I'd have to agree that regex probably isn't what you want.
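
To make the "count mismatches" idea concrete, here is a rough sketch of the
simplest version: the naive exact-match scan, changed so that it tolerates up
to a given number of substituted characters and reports how many it saw. The
name find_with_mismatches and its signature are made up for illustration and
are not part of Boost or of any Topcoder solution. It only handles
substitutions (no insertions or deletions) and runs in O(n*m) time, so treat
it as a starting point rather than a fast implementation.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not a Boost API.
// Returns (offset, mismatch count) for every alignment of `pattern` in `text`
// that differs in at most `max_mismatches` character positions.
std::vector<std::pair<std::size_t, std::size_t>>
find_with_mismatches(const std::string& text,
                     const std::string& pattern,
                     std::size_t max_mismatches)
{
    std::vector<std::pair<std::size_t, std::size_t>> hits;
    if (pattern.empty() || pattern.size() > text.size())
        return hits;

    for (std::size_t start = 0; start + pattern.size() <= text.size(); ++start) {
        std::size_t mismatches = 0;
        for (std::size_t i = 0; i < pattern.size(); ++i) {
            // Count differing positions; give up on this alignment early
            // once the budget is exceeded.
            if (text[start + i] != pattern[i] && ++mismatches > max_mismatches)
                break;
        }
        if (mismatches <= max_mismatches)
            hits.emplace_back(start, mismatches);
    }
    return hits;
}
```

With this, searching "foobar" for "fun" with a budget of 2 reports one hit at
offset 0 with 2 mismatches, which is exactly the case Richard asks about
below; with a budget of 1 it reports nothing. Returning the count along with
the offset is what lets the caller decide how close is close enough.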

On 8/29/06, Richard Damon <Richard_at_[hidden]> wrote:
>
> I want to point out that regex may not be the right base to start from.
> The key thing that I am seeing is that ALL of your sample strings to look
> for have been simple strings, with NO "regular expressions" in them
> (things like optional strings, alternates, or repeats). When you include
> those options, the answer you are looking for becomes not practically
> computable, as there may be many possible "difference sets" to make the
> match. Also, how close do you need? Can I say that I find fun in foobar
> with a mismatch on the second and third characters? (If so, you will hit
> a LOT of partial matches.) As has been pointed out, you need a precise
> definition of the mismatches allowed, and then you can work toward that.
> I personally do NOT think regex is apt to give you the results you want,
> because I get the feeling that you are going to want to know how close
> the match is, and regex only really knows whether it matched or not.
>
> Richard Damon.


