Subject: Re: [boost] [libboost_regex-1_32.dylib] Validating Email address usingRegx failing in boost
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2009-10-22 22:26:35


On Thu, Oct 22, 2009 at 1:03 PM, Arun <arun.ka_at_[hidden]> wrote:
> I have also tried the Perl code posted in this thread.
> The output looks like this:
>
> perl test.pl
> Sequence (?<a...) not recognized in regex; marked by <-- HERE in
> m/^((?>[a-zA-Z\d!#0&'*+\-/=?^_`{|}~]+\x20*|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*"\x20*)*(?<a
> <-- HERE ngle><))?((?!\.)(?>\.?[a-zA-Z\d!#0&'*+\-/=?^_`{|}~]+)+|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*")@(((?!-)[a-zA-Z\d\-]+(?<!-)\.)+[a-zA-Z]{2,}|\[(((?(?<!\[)\.)(25[0-5]|2[0-4]\d|[01]?\d?\d)){4}|[a-zA-Z\d\-]*[a-zA-Z\d]:((?=[\x01-\x7f])[^\\\[\]]|\\[\x01-\x7f])+)\])(?(angle)>)$/
> at test.pl line 2.
>
>
> John, comparing my output to yours, the error for me is
> reported at the first instance of "angle" rather than the second
> one.
>
> Could this be because we are using different operating systems?
>
> I am using Mac OS X 10.5.5 and
>
> perl -v
>
> This is perl, v5.8.8 built for darwin-thread-multi-2level
> (with 1 registered patch, see perl -V for more detail)
>
>
> -Arun
>
> On Thu, Oct 22, 2009 at 10:08 PM, Arun <arun.ka_at_[hidden]> wrote:
>> Hi John,
>> Thanks for your reply.
>>
>> I am using the code as below.
>>
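>> bool testRegExMatch(const std::string& aRegex, const std::string& aTestString)
>> {
>>        // Constructing boost::regex throws boost::bad_expression on an invalid pattern.
>>        boost::regex regExpr(aRegex);
>>
>>        // regex_match succeeds only if the whole string matches the pattern.
>>        return boost::regex_match(aTestString, regExpr);
>> }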
>>
>> bool validateEmailAddressWithRegx(std::string aEmailAddressStr)
>> {
>>        std::string expr("^((?>[a-zA-Z\\d!#$%&'*+\\-/=?^_`{|}~]+\\x20*|\"((?=[\\x01-\\x7f])[^\"\\\\]|\\\\[\\x01-\\x7f])*\"\\x20*)*(?<angle><))?((?!\\.)(?>\\.?[a-zA-Z\\d!#$%&'*+\\-/=?^_`{|}~]+)+|\"((?=[\\x01-\\x7f])[^\"\\\\]|\\\\[\\x01-\\x7f])*\")@(((?!-)[a-zA-Z\\d\\-]+(?<!-)\\.)+[a-zA-Z]{2,}|\\[(((?(?<!\\[)\\.)(25[0-5]|2[0-4]\\d|[01]?\\d?\\d)){4}|[a-zA-Z\\d\\-]*[a-zA-Z\\d]:((?=[\\x01-\\x7f])[^\\\\\\[\\]]|\\\\[\\x01-\\x7f])+)\\])(?(<angle>)>)$");
>>
>>        cout<<expr;
>>
>>        return testRegExMatch(expr, aEmailAddressStr);
>> }
>>
>> ^((?>[a-zA-Z\d!#$%&'*+\-/=?^_`{|}~]+\x20*|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*"\x20*)*(?<angle><))?((?!\.)(?>\.?[a-zA-Z\d!#$%&'*+\-/=?^_`{|}~]+)+|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*")@(((?!-)[a-zA-Z\d\-]+(?<!-)\.)+[a-zA-Z]{2,}|\[(((?(?<!\[)\.)(25[0-5]|2[0-4]\d|[01]?\d?\d)){4}|[a-zA-Z\d\-]*[a-zA-Z\d]:((?=[\x01-\x7f])[^\\\[\]]|\\[\x01-\x7f])+)\])(?(<angle>)>)$
>> [Session started at 2009-10-22 22:02:58 +0530.]
>> terminate called after throwing an instance of 'boost::bad_expression'
>>  what():  Invalid preceding regular expression
>>
>>
>> Is there anything I am missing?
>>
>> -Arun
>>
>> But still i am getting the output as
>>
>>
>>
>>
>> On Thu, Oct 22, 2009 at 9:35 PM, John Maddock <john_at_[hidden]> wrote:
>>>> I am using the Boost regex library (libboost_regex-1_32.dylib) to
>>>> validate strings. The regular expression string I use is
>>>> below.
>>>>
>>>> string expr =
>>>> "^((?>[a-zA-Z\\d!#$%&'*+\\-/=?^_`{|}~]+\\x20*|\"((?=[\\x01-\\x7f])[^\"\\\\]|\\\\[\\x01-\\x7f])*\"\\x20*)*(?<angle><))?((?!\\.)(?>\\.?[a-zA-Z\\d!#$%&'*+\\-/=?^_`{|}~]+)+|\"((?=[\\x01-\\x7f])[^\"\\\\]|\\\\[\\x01-\\x7f])*\")@(((?!-)[a-zA-Z\\d\\-]+(?<!-)\\.)+[a-zA-Z]{2,}|\\[(((?(?<!\\[)\\.)(25[0-5]|2[0-4]\\d|[01]?\\d?\\d)){4}|[a-zA-Z\\d\\-]*[a-zA-Z\\d]:((?=[\\x01-\\x7f])[^\\\\\\[\\]]|\\\\[\\x01-\\x7f])+)\\])(?(angle)>)$";
>>>>
>>>> When I make the call
>>>>
>>>> boost::regex regExpr(expr);
>>>>
>>>> it throws, and what() reports: Invalid preceding regular expression.
>>>
>>> Sigh... I *really* need to improve those error messages :-(
>>>
>>> I tried the expression in Perl and got an error as well:
>>>
>>> $in = 'name.surname_at_[hidden]';
>>> $in =~
>>> /^((?>[a-zA-Z\d!#$%&'*+\-\/=?^_`{|}~]+\x20*|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*"\x20*)*(?<angle><))?((?!\.)(?>\.?[a-zA-Z\d!#$%&'*+\-\/=?^_`{|}~]+)+|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*")@(((?!-)[a-zA-Z\d\-]+(?<!-)\.)+[a-zA-Z]{2,}|\[(((?(?<!\[)\.)(25[0-5]|2[0-4]\d|[01]?\d?\d)){4}|[a-zA-Z\d\-]*[a-zA-Z\d]:((?=[\x01-\x7f])[^\\\[\]]|\\[\x01-\x7f])+)\])(?(angle)>)$/;
>>> print "\n";
>>>  print "\$& = $&\n";
>>>  print "\$1 = $1\n";
>>>  print "\$2 = $2\n";
>>>  print "\$3 = $3\n";
>>>  print "\$4 = $4\n";
>>>  print "\$5 = $5\n";
>>>  print "\$6 = $6\n";
>>>  print "\$7 = $7\n";
>>>  print "\$8 = $8\n";
>>>
>>> Prints:
>>>
>>> Unknown switch condition (?(an in regex; marked by <-- HERE in
>>> m/^((?>[a-zA-Z\d!#0&'*+\-/=?^_`{|}~]+\x20*|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*"\x20*)*(?<angle><))?((?!\.)(?>\.?[a-zA-Z\d!#0&'*+\-/=?^_`{|}~]+)+|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*")@(((?!-)[a-zA-Z\d\-]+(?<!-)\.)+[a-zA-Z]{2,}|\[(((?(?<!\[)\.)(25[0-5]|2[0-4]\d|[01]?\d?\d)){4}|[a-zA-Z\d\-]*[a-zA-Z\d]:((?=[\x01-\x7f])[^\\\[\]]|\\[\x01-\x7f])+)\])(?(
>>> <-- HERE angle)>)$/ at test.pl line 3.
>>>
>>> So Perl and Boost.Regex are both rejecting the "(?(angle)>)" part, and
>>> looking at http://perldoc.perl.org/perlre.html I believe it should be
>>> rejected.  It appears this is a .NET-specific construct :-(
>>>
>>> Changing to "(?(<angle>)>)" seems to make everything work though, in code:
>>>
>>>  boost::regex test (
>>> "^((?>[a-zA-Z\\d!#$%&'*+\\-/=?^_`{|}~]+\\x20*|\"((?=[\\x01-\\x7f])[^\"\\\\]|\\\\[\\x01-\\x7f])*\"\\x20*)*(?<angle><))?((?!\\.)(?>\\.?[a-zA-Z\\d!#$%&'*+\\-/=?^_`{|}~]+)+|\"((?=[\\x01-\\x7f])[^\"\\\\]|\\\\[\\x01-\\x7f])*\")@(((?!-)[a-zA-Z\\d\\-]+(?<!-)\\.)+[a-zA-Z]{2,}|\\[(((?(?<!\\[)\\.)(25[0-5]|2[0-4]\\d|[01]?\\d?\\d)){4}|[a-zA-Z\\d\\-]*[a-zA-Z\\d]:((?=[\\x01-\\x7f])[^\\\\\\[\\]]|\\\\[\\x01-\\x7f])+)\\])(?(<angle>)>)$"
>>> );

I thought one of the correct regexes for email (going strictly by the
standard) is at http://ex-parrot.com/~pdw/Mail-RFC822-Address.html, which
supports everything except email comments (which would blow it up
even more than it already is)?

And yes, I tried your regex; not even JavaScript's parser was able to
parse it. And what kind of .NET-specific crap is that, and what is it
supposed to do?
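
For what it's worth, as far as I can tell the "(?<angle><)" ...
"(?(angle)>)" pair only says "require a closing '>' if an opening '<'
was matched". A rough sketch of the same idea without named groups or
conditionals is a plain alternation. Assumptions here: this uses
C++11 std::regex (ECMAScript grammar, so close to what a JavaScript
parser accepts) rather than Boost 1.32, the address pattern kAddr is a
deliberately simplified stand-in and not the full expression from this
thread, and tryCompile / matchesOptionalAngle are hypothetical helper
names. The try/catch also avoids the "terminate called" crash by
reporting a bad pattern instead.

```cpp
#include <iostream>
#include <regex>
#include <string>

// Hypothetical helper: compile a pattern, returning false (instead of
// letting the exception terminate the program) if the syntax is rejected.
bool tryCompile(const std::string& pattern, std::regex& out)
{
    try {
        out.assign(pattern);  // default grammar is ECMAScript
        return true;
    } catch (const std::regex_error& e) {
        std::cerr << "bad pattern: " << e.what() << '\n';
        return false;
    }
}

// Simplified stand-in for the address part; NOT the full RFC-style
// expression discussed in this thread.
static const char* const kAddr =
    "[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+";

// Accept either "<addr>" or "addr", but never a lone '<' or '>'.
// The alternation replaces the (?<angle><) ... (?(angle)>) conditional.
bool matchesOptionalAngle(const std::string& s)
{
    const std::string addr = kAddr;
    std::regex re;
    if (!tryCompile("^(?:<" + addr + ">|" + addr + ")$", re))
        return false;
    return std::regex_match(s, re);
}
```

Calling matchesOptionalAngle with a hypothetical address such as
"<name.surname@example.com>" or "name.surname@example.com" should
succeed, while a string with only one of the two brackets should not.
The cost is that the address pattern appears twice in the expression,
which is why the conditional form is attractive where it is supported.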

