Eugene M. Kim wrote:
I am looking for a way to feed the input bytes only once, in order to
reduce the time complexity to approximately O(n). Of course I will lose
the ability to recover submatches, but the ability is unnecessary for
this particular application that I am considering. I cannot break the
regex into two parts ("omg" and "wtf") either because it is specified as
an input to the program.
Hi,
It may sound silly but what if you pre-process the input splitting it by
".*" and do an AND match for all sub-strings?
input = "omg.*wtf" => input[] = "omg", "wtf";
return (regexp_match(input[0]) && regexp_match(input[1]));
cheers,
--renato
PS: of course the code above is pseudo-code and of course you'll take as
many input parameters as available and not hard-coded like this.
That would work for simple cases where the two sub-patterns are
separated by ".*", but fail for other non-trivial expressions such as
"omg(bbq|cakes)*wtf", which the program must also be able to accept.
;_;