Boost logo

Boost :

From: Moore, Paul (Paul.Moore_at_[hidden])
Date: 2000-01-10 08:25:01


From: Dave Abrahams [mailto:abrahams_at_[hidden]]
>
> Funny thing: I always thought of knuth-morris-pratt and
> boyer-moore as search algorithms, not types of patterns,
> as you have defined them. Is pattern type really so intimately
> tied to search algorithm? For example, a regular expression
> might be searched using NFAs or DFAs. These really seem
> like different concepts to me.

I started by agreeing with you. But think about it this way:

1. Search for a substring in a string.
2. Search for a substring *with precomputed KMP stats attached*.
3. Search for a substring *with precomputed BM stats attached*.

Under this interpretation, a string_with_XXX_stats type makes sense
(generally as a temporary object built on the fly at the call site of the
search). Consider Perl's "study" operation, which attaches optimisation
stats (BM, I think) to a string, in the anticipation of doing lots of
searches. If you look at this as creation and retention of an explicit
string_with_BM_stats object, it looks sensible.

Of course, the implementation of

search(str, substr)
search(str, substr_with_BM)
search(str, substr_with_KMP)

looks like 3 distinct algorithms. But that's overloading/specialisation for
you. You're still just searching.

Similarly, NFA_RE and DFA_RE can be viewed as distinct types, which can be
passed to an overloaded search(). In this case, the distinction is stronger,
as you can do things with NFAs that you can't with DFAs (eg, backreferences)
so the constructors differ (NFA(re_str) may be semantically different from
DFA(re_str) for the same re_str).

On reflection, this design is unconventional, but I like it. Then again,
I've only skimmed the document so far.

Paul.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk