From: John Maddock (john_at_[hidden])
Date: 2005-05-04 04:42:22


> Is there a way to find out either the string, begin/end iterator or even
> the position of the regex string that created a sub_match?
>
> Let's say I have a regex parser that allows a dynamically defined regex to
> be passed (say, user defined, from a config file).
>
> So a user might pass:
> Bla (.*) blerk (.*)
> ([[:digit:]]+) Bla .* ([[:alpha:]]*)
>
> Or pretty much any regex (actually, in my case, this is not completely
> true, however let's just say so for argument's sake ...).
>
> Now assuming another arbitrary string is passed to this function, and it
> matches the user-passed regex. How do I find out what the REGEX was that
> matched this particular sub_match (or the position in the passed regex)?
>
> For example, if the second regex above was passed, I know by looking at it
> that matches[1] was 'created' by ([[:digit:]]+) and matches[2] was
> 'created' by ([[:alpha:]]*). However programmatically, I cannot find out.
>
> Why can sub_match not include an mfirst and msecond parameter (and mstr()
> function), that like first and second (and str()), will point to the
> beginning and end, but of the regex string instead of the 'input' string?

Because no-one has ever asked for it before. If it were to be done, the
information would be stored inside the basic_regex object, because it's
strictly a property of the regular expression, so there's no need to clutter
up the sub_match object with that information. However, it would slow down
regex parsing and compilation quite a bit, because it would involve dynamic
memory structures (something I've been trying to eliminate as far as
possible for performance reasons).
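
In the meantime, one possible workaround is to recover the positions outside
the library by scanning the pattern string yourself. What follows is only a
rough sketch under stated assumptions, not anything Boost.Regex provides:
capture_group_spans is a hypothetical helper, it assumes the default
Perl-style syntax, and it treats every "(?" group as non-capturing (so named
captures and parentheses inside (?#...) comments would be handled wrongly).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Boost.Regex.
// For each capturing group in `pattern`, returns the half-open offset range
// [first, second) of that group inside the pattern, parentheses included.
// Element 0 describes group 1, element 1 describes group 2, and so on.
// An unclosed group keeps std::string::npos as its `second`.
std::vector<std::pair<std::size_t, std::size_t>>
capture_group_spans(const std::string& pattern)
{
    const std::size_t npos = std::string::npos;
    const std::size_t n = pattern.size();
    std::vector<std::pair<std::size_t, std::size_t>> spans;
    std::vector<std::size_t> open_groups; // index into spans, npos if non-capturing

    for (std::size_t i = 0; i < n; ++i) {
        const char c = pattern[i];
        if (c == '\\') {
            ++i; // skip the escaped character
        } else if (c == '[') {
            // Skip a whole bracket expression so that '(' and ')' inside it
            // are not mistaken for group delimiters.
            std::size_t j = i + 1;
            if (j < n && pattern[j] == '^') ++j;
            if (j < n && pattern[j] == ']') ++j; // a leading ']' is a literal
            while (j < n && pattern[j] != ']') {
                if (pattern[j] == '\\') {
                    ++j; // escaped character inside the set
                } else if (pattern[j] == '[' && j + 1 < n &&
                           (pattern[j + 1] == ':' || pattern[j + 1] == '.' ||
                            pattern[j + 1] == '=')) {
                    // Nested item such as [:digit:], [.x.] or [=x=].
                    const std::string close = std::string(1, pattern[j + 1]) + "]";
                    const std::size_t end = pattern.find(close, j + 2);
                    if (end == npos) { j = n; break; }
                    j = end + 1; // now on the ']' that closes the nested item
                }
                ++j;
            }
            i = j; // on the closing ']' (or past the end if unterminated)
        } else if (c == '(') {
            if (i + 1 < n && pattern[i + 1] == '?') {
                open_groups.push_back(npos); // assumed non-capturing
            } else {
                open_groups.push_back(spans.size());
                spans.emplace_back(i, npos);
            }
        } else if (c == ')' && !open_groups.empty()) {
            const std::size_t idx = open_groups.back();
            open_groups.pop_back();
            if (idx != npos) {
                assert(idx < spans.size());
                spans[idx].second = i + 1;
            }
        }
    }
    return spans;
}
```

With this, the piece of the pattern behind matches[k] (for k >= 1) would be
pattern.substr(spans[k-1].first, spans[k-1].second - spans[k-1].first), since
groups are numbered in order of their opening parenthesis.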

Either way, this won't be in 1.33. After that, I guess it depends on whether
there is demand from anyone else for this; you seem to be a little unusual
in needing this at present.

John.

