Boost logo

Boost :

Subject: Re: [boost] [Review:Algorithms] Comments about search algorithms
From: dpxguard-boost_at_[hidden]
Date: 2011-09-28 13:05:30


Oh, and something important which I forgot to mention: Bad-character shifting requires the presence of random-access iterators in order to work efficiently. Consequently, since Boyer-Moore and Boyer-Moore-Horspool are using bad-character shifting, they are only efficient for random-access iterators. Hence KMP will be still useful, even if it is much slower in the most common cases. And a question for Marshall: I have noticed that inside the search.hpp file you have a requirement for random-access iterators regarding the Knuth-Pratt-Morris searching algorithm as well. Is this for both the pattern and the corpus? Phil Endecott wrote: > How efficiently could the current algorithms be used with DNA data?   > Does anyone have any experience with a packed base-4 representation? May I propose an easy way: Why not creating random sequences, using only four char values (ie "abcd") This will easily give you a very good idea about the KMP performance. For a more elaborate aproach, you can always use the tests of Musser & Nishanov: http://www.cs.rpi.edu/~musser/gp/gensearch/index.html > Phil Endecott wrote: > > However, in my opinion the most interesting and useful property of  > > the KMP algorithm is not its performance, but its ability to help in  > > building very generic sequence searching algorithms. In fact KMP has  > > helped me to build an efficient generic sequence searching algorithm,  > > which actually works with ranges accessed by single-pass (input)  > > iterators. (See http://www.codeproject.com/KB/stl/single_pass_search.aspx > > and also in boost-vault/Algorithms) Needless to say that I would be  > > more than happy to submit this work to boost, whenever there is an  > > interest. > I am reminded that Spirit has some sort of adaptor that allows an input  > iterator to be used with a grammar that requires backtracking by  > storing recent inputs in a buffer.  Presumably something like this  > could be used here.  How much does your algorithm gain over a  > combination of such a buffer and one of Marshall's algorithms? Very good idea! I will try to compare my algorithm against this adaptor. A very interesting benchmark for me thank you Phil. Best regards Jim Xochellis Homepage: https://sites.google.com/site/xochellis/


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk