Boost Users :

From: eydelber (eydelber_at_[hidden])
Date: 2002-12-05 10:15:18


--- In Boost-Users_at_y..., "John Maddock" <john_maddock_at_c...> wrote:
> > In this case, you promise that textdb::TextDB::execute will only throw
> > TextDBException. However, RegEx::Match is throwing a different
> > exception. You are not handling this exception in
> > textdb::TextDB::execute, and you have promised not to leak it; the
> > result is that GCC causes your program to abort.
> >
> > By removing the throw specification from your function and adding
> > extra catches in main(), I can tell you that RegEx::Match is throwing
> > an exception whose what() reports "Max regex search depth exceeded."
> >
> > I hope this helps.
>
> Nice work! Thank you :-)
>
> It seems like the expression is getting pathological with some text
> inputs and throwing (otherwise it would just go round and round
> indefinitely, so throwing is the least worst option in this case).
>
> Looking again at your expressions I see:
>
> /* 13 */ "insert( into)? (_ID_)( \\(((\\s*,?\\s*(_ID_)\\s*)+)\\))? values \\(((\\s*,?\\s*((\\d+(\\.\\d+)?)|(\"[^\"]*\"))\\s*)+)\\)
>
> Now I haven't picked this apart, but the trick is to ensure that each
> time the matcher has to choose which option to take (repeat or not,
> take alternative or not) there is only one option it can take.
> Whatever the regex engine in use, this will optimise performance, and
> for backtracking engines it will prevent pathological behaviour. To
> pick just one example in your expression:
>
> \\s*,?\\s*
>
> this will misbehave if there is a lot of whitespace and no ",";
> changing to:
>
> \\s*(,\\s*)?
>
> fixes the issue.
>
> Elsewhere several of your repeats both start and finish with \\s*, so
> again there is plenty of room for optimisations.
>
> Hope this gets you started,
>
> John Maddock
> http://ourworld.compuserve.com/homepages/john_maddock/index.htm

Thank you very much, that is most likely the problem then. I'll try
that and if I'm still having problems I'll post back here.
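
On the abort described in the quoted message: a function that promises to throw only TextDBException can keep that promise by catching anything else at its boundary and translating it. The following is only a minimal sketch under assumptions. The real TextDBException is not shown in this thread, so it is modelled here as a class derived from std::runtime_error, and the regex library's exception is assumed to derive from std::exception. match_statement and the free function execute are hypothetical stand-ins for RegEx::Match and textdb::TextDB::execute.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Assumed shape of the poster's exception type (not shown in the thread).
struct TextDBException : std::runtime_error {
    explicit TextDBException(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical stand-in for the regex call. It throws for long inputs to
// imitate the "Max regex search depth exceeded." failure reported above.
inline bool match_statement(const std::string& sql) {
    if (sql.size() > 16) {
        throw std::runtime_error("Max regex search depth exceeded.");
    }
    return !sql.empty();
}

// Hypothetical wrapper: only TextDBException is allowed to escape.
inline bool execute(const std::string& sql) {
    try {
        return match_statement(sql);
    } catch (const TextDBException&) {
        throw;  // already the promised type, pass it through unchanged
    } catch (const std::exception& e) {
        // Translate any other standard exception into the promised type.
        throw TextDBException(std::string("regex failure: ") + e.what());
    }
}
```

With this shape, a caller that catches TextDBException also sees the regex failure and its message, instead of the program aborting.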

The point you make is interesting, however, because if you look at
the first regex:

create table (_ID_) (\\(((\\s*,?\\s*_COL_\\s*)+)\\))

it is essentially the same, yet I have had no problems at all with
it. Is it then safe to assume that the problem does not lie
specifically with \s*,?\s*, but somewhere deeper? Here is the actual
regex that finally gets used (after the call to replaceMacros):

^(\s*insert(\s+into)?\s+([\w\-]+)(\s+\(((\s*(?:,?\s*)?([\w\-]+)\s*)+)\))?\s+values\s+\(((\s*,?\s*((\d+(\.\d+)?)|("[^"]*"))\s*)+)\)\s*;?)$

I'm guessing it is something with:

(\s+\(((\s*(?:,\s*)?([\w\-]+)\s*)+)\))?

This one has a few more groupings than all the other regexes, but I
can't seem to figure out the problem.

BTW, during the course of writing this reply, I tested the
expression with \s*(,\s*)? (as you can see above), and it still
throws an exception.
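
One possible explanation, offered as a guess rather than a diagnosis: as long as the comma stays optional, every separator inside the repeated group can match nothing, so two consecutive iterations can split a single identifier or number in many different ways. When the overall match fails, a backtracking engine may try all of those splits. The sketch below contrasts that shape with a "first item, then comma plus item" shape. It uses std::regex rather than the Boost RegEx class from this thread, simplifies the identifier to \w+, and assumes a comma is actually required between items. All function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Simplified version of the repeated group from the thread. Every separator
// is optional, so "abc" can be read as one item or as "a", "b", "c".
inline const std::regex& ambiguous_list() {
    static const std::regex re(R"re(\((\s*,?\s*\w+\s*)+\))re");
    return re;
}

// Alternative shape: one item, then zero or more ", item" repetitions.
// A new iteration can only start at a comma, so items cannot be split.
inline const std::regex& unambiguous_list() {
    static const std::regex re(R"re(\(\s*\w+(\s*,\s*\w+)*\s*\))re");
    return re;
}

// True if the whole text matches the given pattern.
inline bool matches(const std::regex& re, const std::string& text) {
    return std::regex_match(text, re);
}
```

Both patterns accept "(a, b, c)". They differ on "(abc def)": the first accepts it because the comma is optional, while the second rejects it, so the rewrite only fits if items without a comma between them should be an error.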

