Boost Users :

Subject: Re: [Boost-users] Problems matching parantheses in a string using boost::regex 1.42.0 on a amd64 debian system
From: Bill Buklis (boostusr_at_[hidden])
Date: 2011-08-22 13:11:36


On 8/22/2011 11:56 AM, Simon Hoerder wrote:
> Hi,
>
> I'm fairly new to the Boost libraries. I have searched the documentation
> and googled for an answer to my problem but couldn't come up with anything.
> Please bear with me.
>
> I am using boost::regex (version 1.42.0 on debian amd64 machine) to
> parse a vhdl-like file. To this end, I am trying to use the regular
> expression
> boost::regex("^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO
> ([0-9]*?)\);$");
> to match strings like:
> SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);
>
> My problem is that matching those parentheses in the string doesn't work.
> I've tested the regexp under Perl and it works fine, but with boost::regex
> and C++ it works only if I remove the parentheses from the string and from
> the regular expression.
>
> Issues I observed & tested:
> 1) '\(' ... '\)' no match but g++ complains about an unknown escape
> sequence '\)'.
> 1a) Using '\(' ... ')' doesn't match. (I would have expected an
> exception to be thrown but none whatsoever.)
> 2) '0x28' ... '0x29' doesn't match. (0x28 = ASCII '(', 0x29 = ASCII ')')
> 3) '\Q(\E' ... '\Q)\E' no match but g++ complains about an unknown
> escape sequence '\Q'.
> 4) If I try to match only one of the parentheses and remove the other from
> the string & regex I get one of the following:
> '\(' ... gives me:
> terminate called after throwing an instance of
> 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error> >'
> what(): Found a closing ) with no corresponding openening
> parenthesis. The error occured while parsing the regular expression
> fragment: '?([0-9]*?)>>>HERE>>>);$'.
> ... '\)' gives me:
> terminate called after throwing an instance of
> 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error> >'
> what(): Unmatched marking parenthesis ( or \(. The error occured
> while parsing the regular expression fragment: '[0-9]*?);$>>>HERE>>>'.
> 5) Even
> boost::regex("^SIGNAL W07: STD_LOGIC_VECTOR\( 255 DOWNTO 0\);$");
> fails to match
> SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);
>
> Points 1a and 4 suggest that boost::regex treats even escaped
> parentheses as special characters, which is quite surprising to me. I am
> sure a lot of people have successfully used boost::regex to match
> parentheses, so I do wonder which part of the documentation I have missed.
>
> I'd be happy to provide more debugging information if you tell me how to
> produce it.
>
> Many thanks,
> Simon Hoerder
>

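You need to escape the backslash character itself for normal C++
string-literal processing, i.e. use a double backslash. Otherwise the
compiler consumes the single backslash (hence the g++ "unknown escape
sequence" warning) and the regex only ever sees a plain, unescaped
parenthesis, which it treats as a grouping character.
It should be "\\(" and "\\)" to match literal parentheses.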
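
A minimal sketch of the difference, in case it helps. To keep it
compilable without Boost it uses std::regex from C++11 rather than
boost::regex (that swap is an assumption of this sketch, not what the
original post used); boost::regex and boost::regex_match should accept
the same pattern strings. The helper names matches, capture and
check_examples are hypothetical. kUnescaped spells out what the regex
engine ends up seeing once the compiler has dropped the single
backslashes.

```cpp
#include <cassert>
#include <regex>
#include <string>

// What the regex engine receives when the literal is written with "\\(" and "\\)".
const std::string kEscaped =
    "^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\\( ([0-9]*?) DOWNTO ([0-9]*?)\\);$";

// The same pattern as a C++11 raw string literal: backslashes pass through untouched.
const std::string kRaw =
    R"re(^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO ([0-9]*?)\);$)re";

// What the engine receives if the compiler drops the backslash from "\(" and "\)":
// the parentheses now open and close an ordinary capture group.
const std::string kUnescaped =
    "^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR( ([0-9]*?) DOWNTO ([0-9]*?));$";

// Hypothetical helper: true if the whole line matches the pattern.
bool matches(const std::string& pattern, const std::string& line) {
    return std::regex_match(line, std::regex(pattern));
}

// Hypothetical helper: text of capture group n, or "" if the line does not match.
std::string capture(const std::string& pattern, const std::string& line,
                    std::size_t n) {
    std::smatch m;
    const std::regex re(pattern);
    return std::regex_match(line, m, re) ? m[n].str() : std::string();
}

void check_examples() {
    const std::string line = "SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);";

    // Doubled backslashes and the raw literal produce the identical pattern.
    assert(kEscaped == kRaw);

    // With the parentheses escaped, the original line matches.
    assert(matches(kEscaped, line));
    assert(capture(kEscaped, line, 1) == "W07");
    assert(capture(kEscaped, line, 2) == "255");
    assert(capture(kEscaped, line, 3) == "0");

    // Without the escapes, only the line with the parentheses removed matches.
    assert(!matches(kUnescaped, line));
    assert(matches(kUnescaped, "SIGNAL W07: STD_LOGIC_VECTOR 255 DOWNTO 0;"));
}
```

Calling check_examples() from main() exercises all three patterns. The
last two assertions mirror the observation in the original post that
the match only succeeds once the parentheses are removed from the
string. Raw string literals need a C++11 compiler, so on an older
toolchain the doubled-backslash form is the safer choice.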

-- 
Bill
