Boost logo

Boost :

From: Vladimir Pozdyayev (ardatur_at_[hidden])
Date: 2004-10-25 03:08:39


Hello.

Are there any plans to add finite automata-based subengine to the
regex library? If there are, I can submit the algorithm/implementation
of a modified nondeterministic FA which takes care of
   * greedy and lazy quantifiers,
   * eagerness of the alternation operator,
   * non-longest matches (`best' in the Perl-ish sense with respect to
     greediness/laziness/eagerness),
   * substring capturing.
All these features being included, the time cost of a match is
O(N(E+C)L), where N is the number of states, E is the average
epsilon-closure size per state, C is the number of capturing
parentheses, and L is the input length. As far as I know that's the
lowest order of time complexity in this area (corrections are
welcome)---though, of course, deterministic FA would be even faster.

I suppose there's no point in adding another regex library, but being
able to guarantee linear time cost in an existing one would be nice.

-- 
Best regards,
Vladimir Pozdyayev.

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk