
Boost Users :

From: Daniel Yerushalmi (yg-boost-users_at_[hidden])
Date: 2002-12-06 03:09:29


I think that if you have to parse very complicated expressions, you should
try matching them with Spirit (which is also a Boost candidate library). This
library uses a recursive descent parser instead of regular expressions
(and is very convenient, at least for me).

Take a look at: http://spirit.sourceforge.net
Regards
  Daniel
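
To illustrate what "recursive descent" means here, below is a minimal
hand-written sketch in standard C++. It is not Spirit code and does not use
Spirit's API. The grammar is an assumption: a simplified version of the
"insert" statement from the quoted message, in which commas between items are
mandatory. The names InsertStmt, InsertParser and parseInsert are hypothetical.
Each grammar rule becomes one function that makes a single choice based on the
next character, so there is no backtracking over whitespace.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for a simplified "insert" statement.
struct InsertStmt {
    std::string table;
    std::vector<std::string> columns;
    std::vector<std::string> values;
};

// Assumed grammar (a simplification of the regex in the thread):
//   stmt  := "insert" ["into"] ident ["(" ident {"," ident} ")"]
//            "values" "(" value {"," value} ")" [";"]
//   value := digits ["." digits] | '"' any-non-quote-chars '"'
class InsertParser {
public:
    explicit InsertParser(const std::string& text) : s_(text), pos_(0) {}

    bool parse(InsertStmt& out) {
        if (!keyword("insert")) return false;
        keyword("into");                       // optional keyword
        if (!ident(out.table)) return false;
        if (symbol('(')) {                     // optional column list
            if (!list(out.columns, &InsertParser::ident)) return false;
            if (!symbol(')')) return false;
        }
        if (!keyword("values") || !symbol('(')) return false;
        if (!list(out.values, &InsertParser::value)) return false;
        if (!symbol(')')) return false;
        symbol(';');                           // optional terminator
        skipSpace();
        return pos_ == s_.size();              // reject trailing garbage
    }

private:
    typedef bool (InsertParser::*Item)(std::string&);

    static bool isIdentChar(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '-';
    }
    void skipSpace() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    bool symbol(char c) {
        skipSpace();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    bool keyword(const std::string& kw) {
        skipSpace();
        assert(pos_ <= s_.size());
        if (s_.compare(pos_, kw.size(), kw) != 0) return false;
        std::size_t end = pos_ + kw.size();
        if (end < s_.size() && isIdentChar(s_[end])) return false;  // e.g. "insertx"
        pos_ = end;
        return true;
    }
    bool ident(std::string& out) {
        skipSpace();
        std::size_t start = pos_;
        while (pos_ < s_.size() && isIdentChar(s_[pos_])) ++pos_;
        out = s_.substr(start, pos_ - start);
        return pos_ > start;
    }
    bool digits() {
        std::size_t start = pos_;
        while (pos_ < s_.size() && std::isdigit(static_cast<unsigned char>(s_[pos_]))) ++pos_;
        return pos_ > start;
    }
    bool value(std::string& out) {
        skipSpace();
        if (pos_ < s_.size() && s_[pos_] == '"') {   // quoted string
            std::size_t close = s_.find('"', pos_ + 1);
            if (close == std::string::npos) return false;
            out = s_.substr(pos_ + 1, close - pos_ - 1);
            pos_ = close + 1;
            return true;
        }
        std::size_t start = pos_;                    // number
        if (!digits()) return false;
        if (pos_ < s_.size() && s_[pos_] == '.') {
            ++pos_;
            if (!digits()) return false;
        }
        out = s_.substr(start, pos_ - start);
        return true;
    }
    // One item, then zero or more ", item".
    bool list(std::vector<std::string>& out, Item item) {
        do {
            std::string tok;
            if (!(this->*item)(tok)) return false;
            out.push_back(tok);
        } while (symbol(','));
        return true;
    }

    const std::string& s_;
    std::size_t pos_;
};

// Convenience wrapper (hypothetical name).
inline bool parseInsert(const std::string& text, InsertStmt& out) {
    return InsertParser(text).parse(out);
}
```

Calling parseInsert on a statement fills in the table name, the column names
and the values; a library such as Spirit lets you express the same rules
declaratively instead of writing the functions by hand.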

"eydelber" <eydelber_at_[hidden]> wrote in message
news:asnqi6+oea6_at_eGroups.com...
> --- In Boost-Users_at_y..., "John Maddock" <john_maddock_at_c...> wrote:
> > > In this case, you promise that textdb::TextDB::execute will only throw
> > > TextDBException. However, RegEx::Match is throwing a different exception.
> > > You are not handling this exception in textdb::TextDB::execute, and you
> > > have promised not to leak it; the result is that GCC causes your program
> > > to abort.
> > >
> > > By removing the throw specification from your function and adding extra
> > > catches in main(), I can tell you that RegEx::Match is throwing an
> > > exception whose what() reports "Max regex search depth exceeded."
> > >
> > > I hope this helps.
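
One way to keep such a promise is to catch anything the regex call throws and
rethrow it as the promised type. The following is only a sketch under stated
assumptions: TextDBException and execute are hypothetical stand-ins for the
poster's code, std::regex stands in for the RegEx class discussed above, and
no throw specification is written because dynamic exception specifications
were removed from the language in C++17.

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the poster's exception type.
struct TextDBException : std::runtime_error {
    explicit TextDBException(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical stand-in for textdb::TextDB::execute. Any exception raised
// while building or running the regex is translated, so callers only ever
// see TextDBException.
inline bool execute(const std::string& pattern, const std::string& statement) {
    try {
        const std::regex re(pattern);            // may throw std::regex_error
        return std::regex_match(statement, re);  // may also throw on some inputs
    } catch (const std::exception& e) {
        throw TextDBException(std::string("regex failure: ") + e.what());
    }
}
```

With this shape, a single catch of TextDBException in main() is enough to
report the regex problem instead of aborting.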
> >
> > Nice work! Thank you :-)
> >
> > It seems like the expression is getting pathological with some text inputs
> > and throwing (otherwise it would just go round and round indefinitely, so
> > throwing is the least worst option in this case).
> >
> > Looking again at your expressions I see:
> >
> > /* 13 */ "insert( into)? (_ID_)( \\(((\\s*,?\\s*(_ID_)\\s*)+)\\))? values \\(((\\s*,?\\s*((\\d+(\\.\\d+)?)|(\"[^\"]*\"))\\s*)+)\\)
> >
> > Now I haven't picked this apart, but the trick is to ensure that each time
> > the matcher has to choose which option to take (repeat or not, take an
> > alternative or not) there is only one option it can take. Whatever the
> > regex engine in use, this will optimise performance, and for backtracking
> > engines it will prevent pathological behaviour. To pick just one example in
> > your expression:
> >
> > \\s*,?\\s*
> >
> > this will misbehave if there is a lot of whitespace and no ","; changing to:
> >
> > \\s*(,\\s*)?
> >
> > fixes the issue.
> >
> > Elsewhere, several of your repeats both start and finish with \\s*, so again
> > there is plenty of room for optimisations.
> >
> > Hope this gets you started,
> >
> > John Maddock
> > http://ourworld.compuserve.com/homepages/john_maddock/index.htm
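
The rewrite suggested above is safe because the two separator forms accept
exactly the same strings: with no comma present, \s*,?\s* can split a run of
n whitespace characters between its two \s* in n+1 ways, while \s*(,\s*)? has
only one way to consume it. The sketch below checks the equivalence on a few
inputs. It uses std::regex as a stand-in for the RegEx class from the thread,
the function names are hypothetical, and it says nothing about timing.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Original form: ambiguous when the input is whitespace with no comma.
inline bool matchesAmbiguous(const std::string& text) {
    static const std::regex re("\\s*,?\\s*");
    return std::regex_match(text, re);
}

// Suggested form: the second \s* is only reachable after a comma.
inline bool matchesUnambiguous(const std::string& text) {
    static const std::regex re("\\s*(,\\s*)?");
    return std::regex_match(text, re);
}

// True when both forms agree on whether `text` is a valid separator.
inline bool formsAgree(const std::string& text) {
    return matchesAmbiguous(text) == matchesUnambiguous(text);
}
```

The same idea applies to the repeats that both start and end with \s*: let
only one of the two adjacent \s* own the whitespace.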
>
> Thank you very much, that is most likely the problem then. I'll try
> that and if I'm still having problems I'll post back here.
>
> The point you make is interesting however, because if you look at
> the first Regex:
>
> create table (_ID_) (\\(((\\s*,?\\s*_COL_\\s*)+)\\))
>
> It is essentially the same, but I have had no problems at all with
> it. Is it then safe to assume that the problem does not lie
> specifically with \s*,\s*, but somewhere deeper? Here is the actual
> regex that finally gets used (after the call to replaceMacros):
>
> ^(\s*insert(\s+into)?\s+([\w\-]+)(\s+\(((\s*(?:,?\s*)?([\w\-]+)\s*)+)\))?\s+values\s+\(((\s*,?\s*((\d+(\.\d+)?)|("[^"]*"))\s*)+)\)\s*;?)$
>
> I'm guessing it is something with:
>
> (\s+\(((\s*(?:,\s*)?([\w\-]+)\s*)+)\))?
>
> This one has a few more groupings than all the other regexes, but I
> can't seem to figure out the problem.
>
> BTW, during the course of writing this reply, I tested the
> expression with \s*(,\s*)? (as you can see above), and it still
> throws an exception.


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net