Boost logo

Boost :

From: Eric Niebler (eric_at_[hidden])
Date: 2004-10-25 18:13:20

Vladimir Pozdyayev wrote:
> Hello.
> Are there any plans to add finite automata-based subengine to the
> regex library? If there are, I can submit the algorithm/implementation
> of a modified nondeterministic FA which takes care of
> * greedy and lazy quantifiers,
> * eagerness of the alternation operator,
> * non-longest matches (`best' in the Perl-ish sense with respect to
> greediness/laziness/eagerness),
> * substring capturing.
> All these features being included, the time cost of a match is
> O(N(E+C)L), where N is the number of states, E is the average
> epsilon-closure size per state, C is the number of capturing
> parentheses, and L is the input length. As far as I know that's the
> lowest order of time complexity in this area (corrections are
> welcome)---though, of course, deterministic FA would be even faster.
> I suppose there's no point in adding another regex library, but being
> able to guarantee linear time cost in an existing one would be nice.

Interesting. You say it supports substring capturing. Does it also
support backreferences? By which I mean, can you refer to the captured
substrings from within the pattern itself, as in the perl regex /(.)\1/?

Eric Niebler
Boost Consulting

Boost list run by bdawes at, gregod at, cpdaniel at, john at