
Boost Users :

From: John Maddock (john_at_[hidden])
Date: 2006-04-29 07:34:11


>> I am using Boost.Regex to check the syntax of my URLs.
>> If I give it a long URL, it gives the following error:
>>
>> terminate called after throwing an instance of
>> 'boost::bad_expression' what(): Memory exhausted
>> Aborted

There is a runtime sanity check inside the regex matcher that results in an
exception being thrown if the complexity of the attempted match exceeds a
certain limit. It's there to prevent you from shooting yourself in the foot
by writing an expression that takes *forever* to match.
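
The "terminate called" message in the quoted output means the exception was
never caught. A minimal sketch of catching it follows. It uses std::regex as a
stand-in so that it compiles without Boost (an assumption made purely for
illustration), and guarded_match and MatchResult are hypothetical names.
boost::regex_error likewise derives from std::runtime_error, so the same catch
clause should carry over to boost::regex.

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Possible outcomes of a guarded match attempt.
enum class MatchResult { Matched, NotMatched, EngineError };

// Hypothetical helper: compiles `pattern` and matches it against the whole of
// `text`, turning any regex exception into a return value instead of letting
// it escape and terminate the program.
MatchResult guarded_match(const std::string& pattern, const std::string& text) {
    try {
        const std::regex re(pattern);  // throws on a malformed pattern
        return std::regex_match(text, re) ? MatchResult::Matched
                                          : MatchResult::NotMatched;
    } catch (const std::runtime_error&) {
        // std::regex_error derives from std::runtime_error, so construction
        // and matching failures both land here.
        return MatchResult::EngineError;
    }
}
```

Catching the exception only keeps the program alive; it does not make the
expression any cheaper to match.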

It's possible to up the limit that triggers the exception by changing the
macros in boost/regex/user.hpp, but really you need to look hard at the
regex you're using and try and optimise it a bit more.

Problems normally occur when you have something that looks like:

(x*)*
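
To get a feel for why this shape is expensive, the sketch below counts how many
ways the slightly simpler (x+)+ can carve up a run of n x's between its inner
and outer repeats. The function name count_splits is hypothetical, and this is
a counting exercise, not how any particular regex engine is implemented. The
count doubles with every extra character, and a backtracking matcher may have
to try many of these splits before it can report that the overall match fails.

```cpp
#include <cstdint>

// Counts the ways a run of n identical characters can be cut into one or more
// non-empty pieces, i.e. the ways (x+)+ can distribute n x's between its
// inner and outer repeats. The result is 2^(n-1) for n >= 1.
std::uint64_t count_splits(unsigned n) {
    if (n == 0) return 1;  // nothing left to split: exactly one way to finish
    std::uint64_t ways = 0;
    for (unsigned first = 1; first <= n; ++first) {
        // The first inner group takes `first` characters; the remainder is
        // split by further iterations of the outer repeat.
        ways += count_splits(n - first);
    }
    return ways;
}
```

For a run of 20 characters that is already 524288 candidate splits.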

In your case I notice you have a triply nested loop like this in the
fragment:

((([a-zA-Z0-9]+-?)?[a-zA-Z0-9]+)+[.]{1}[a-zA-Z0-9]+)+

First off the fragment [.]{1} is just the same as [.] isn't it?

The matcher is seeing this as being to all intents and purposes the same as:

(([a-zA-Z0-9-.]+)+)+

which is what causes the regex to "thrash" when trying to find a match. I
assume you want to assert that . and - are always followed by one or more
[a-zA-Z0-9] characters?

So how about:

([a-zA-Z0-9]|[.-](?=[a-zA-Z0-9]))+

For this whole fragment?
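
A small sketch for trying the suggested fragment on its own follows. It uses
std::regex with its default ECMAScript grammar as a stand-in so that it
compiles without Boost, and is_valid_host_fragment is a hypothetical name. With
Boost the pattern would need the default Perl syntax, since the (?=...)
lookahead is a Perl-style extension and, as far as I know, is not available
under boost::regex::extended as used in the quoted program.

```cpp
#include <regex>
#include <string>

// The suggested fragment: each character is either alphanumeric, or a '.' or
// '-' that a lookahead confirms is followed by an alphanumeric character.
// There is a single repeat, so there are no nested loops to thrash on.
bool is_valid_host_fragment(const std::string& s) {
    static const std::regex re("([a-zA-Z0-9]|[.-](?=[a-zA-Z0-9]))+");
    return std::regex_match(s, re);  // the whole string must match
}
```

Under these assumptions "www.boost.org" and "my-host.example.com" are accepted,
while "a..b", "host." and "host-" are rejected. Note that the pattern as written
does not forbid a leading '.' or '-'.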

Hope this helps, John.

>> #include <iostream>
>> #include <string>
>> #include <boost/regex.hpp>
>>
>> bool EvaluateRegex(const std::string& regularexp, const std::string& url)
>> {
>>     boost::smatch what;
>>     boost::regex exp(regularexp, boost::regex::extended);
>>     // Only inspect what[0] after a successful search.
>>     if (!boost::regex_search(url, what, exp))
>>         return false;
>>     // Well formed only if the match covers the entire URL.
>>     return what[0].str().length() == url.length();
>> }
>>
>> int main()
>> {
>>     std::string url;
>>     const std::string URLSYNTAXREGEX =
>>         "((https?|ftp)://)?(((([a-zA-Z0-9]+-?)?[a-zA-Z0-9]+)+[.]{1}[a-zA-Z0-9]+)+([:]{1}((0[0-7]+)|(0[xX][0-9a-fA-F]+)|([0-9]+))){0,1})([?/][-a-zA-Z0-9+&@#/%?=~_|!:,.;]*)?";
>>     std::cout << "enter the URL" << std::endl;
>>     std::cin >> url;
>>     if (!EvaluateRegex(URLSYNTAXREGEX, url))
>>         std::cout << "Malformed URL" << std::endl;
>>     else
>>         std::cout << "Well formed URL" << std::endl;
>>     return 0;
>> }

