Boost logo

Boost :

From: Vladimir Pozdyayev (ardatur_at_[hidden])
Date: 2004-10-25 03:08:39


Are there any plans to add finite automata-based subengine to the
regex library? If there are, I can submit the algorithm/implementation
of a modified nondeterministic FA which takes care of
   * greedy and lazy quantifiers,
   * eagerness of the alternation operator,
   * non-longest matches (`best' in the Perl-ish sense with respect to
   * substring capturing.
All these features being included, the time cost of a match is
O(N(E+C)L), where N is the number of states, E is the average
epsilon-closure size per state, C is the number of capturing
parentheses, and L is the input length. As far as I know that's the
lowest order of time complexity in this area (corrections are
welcome)---though, of course, deterministic FA would be even faster.

I suppose there's no point in adding another regex library, but being
able to guarantee linear time cost in an existing one would be nice.

Best regards,
Vladimir Pozdyayev.

Boost list run by bdawes at, gregod at, cpdaniel at, john at