Just a quick note: if you're interested in efficient string searching, there's some interesting material in one of the Topcoder Intel Multi-Threaded Marathon matches.  The competition was called String Search.  You have to be a member to view the problem statement and the various solutions, but it's all free.

Anyway, the guy who took first place made interesting use of DSP math in his algorithm.  Might be worth taking a look.  Other than that, I'd recommend adapting one of the exact string matching algorithms to count mismatches.  I'd have to agree that regex probably isn't what you want.
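
To give a rough idea of what I mean by counting mismatches, here is a minimal sketch built on the plain brute-force scan.  It is not the Topcoder winner's approach and not tuned for speed.  It assumes "mismatch" means substituted characters only (no insertions or deletions), and the names Hit and find_with_mismatches are made up for this example:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One alignment of the pattern against the text (hypothetical helper type).
struct Hit {
    std::size_t pos;         // offset in the text where the pattern is aligned
    std::size_t mismatches;  // number of positions where the characters differ
};

// Naive scan: slide the pattern over the text and count differing characters
// at each offset, giving up on an offset once it exceeds max_mismatches.
// Assumption: substitutions only; insertions and deletions are not handled.
std::vector<Hit> find_with_mismatches(const std::string& text,
                                      const std::string& pattern,
                                      std::size_t max_mismatches)
{
    std::vector<Hit> hits;
    if (pattern.empty() || pattern.size() > text.size())
        return hits;

    for (std::size_t i = 0; i + pattern.size() <= text.size(); ++i) {
        std::size_t bad = 0;
        for (std::size_t j = 0; j < pattern.size() && bad <= max_mismatches; ++j) {
            if (text[i + j] != pattern[j])
                ++bad;
        }
        if (bad <= max_mismatches)
            hits.push_back({i, bad});
    }
    return hits;
}
```

Searching for "fun" in "foobar" with max_mismatches set to 2 reports a single hit at offset 0 with 2 mismatches, and with max_mismatches set to 0 it reports nothing.  Since every hit carries its mismatch count, you can rank hits by closeness, which is the part a plain regex match won't tell you.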

On 8/29/06, Richard Damon <Richard@damon-family.org> wrote:
I want to point out that regex may not be the right base to start from. The
key thing that I am seeing is ALL of your sample strings to look for have
been simple strings, with NO "regular expressions" in them (things like
optional strings, alternates, or repeats). When you include those options
the answer you are looking for becomes not practically computable, as there may
be many possible "difference sets" to make the match. Also, how close do you
need? Can I say that I find fun in foobar with a mismatch on the second and
third characters? (If so, you will hit a LOT of partial matches.) As has been
pointed out, you need a precise definition of the mismatches allowed and
then you can work for that. I personally do NOT think regex is apt to give
you the results you want, because I get the feeling that you are going to
want to know how close the match is, and regex only really knows if it
matched or not.

Richard Damon.

_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users