
Boost Users :

Subject: Re: [Boost-users] Problems matching parentheses in a string using boost::regex 1.42.0 on an amd64 debian system
From: Simon Hoerder (simon_at_[hidden])
Date: 2011-08-22 13:28:58


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA224

On 22/08/11 18:11, Bill Buklis wrote:
> On 8/22/2011 11:56 AM, Simon Hoerder wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA224
>>
>> Hi,
>>
>> I'm fairly new to the Boost libraries. I searched the documentation and
>> googled for an answer to my problem but couldn't find anything.
>> Please bear with me.
>>
>> I am using boost::regex (version 1.42.0 on a Debian amd64 machine) to
>> parse a VHDL-like file. To this end, I am trying to use the regular
>> expression
>> boost::regex("^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO
>> ([0-9]*?)\);$");
>> to match strings like:
>> SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);
>>
>> My problem is that matching those parentheses in the string doesn't work
>> - I've tested the regexp under Perl and it works fine, but with boost::regex
>> and C++ it works only if I remove the brackets from the string and from the
>> regular expression.
>>
>> Issues I observed & tested:
>> 1) '\(' ... '\)' gives no match, and g++ complains about an unknown escape
>> sequence '\)'.
>> 1a) Using '\(' ... ')' doesn't match. (I would have expected an
>> exception to be thrown, but none was.)
>> 2) '0x28' ... '0x29' doesn't match. (0x28 = ASCII '(', 0x29 = ASCII ')')
>> 3) '\Q(\E' ... '\Q)\E' gives no match, and g++ complains about an unknown
>> escape sequence '\Q'.
>> 4) If I try to match only one of the brackets and remove the other from
>> the string & regex I get one of the following:
>> '\(' ... gives me:
>> terminate called after throwing an instance of
>> 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error> >'
>> what(): Found a closing ) with no corresponding openening
>> parenthesis. The error occured while parsing the regular expression
>> fragment: '?([0-9]*?)>>>HERE>>>);$'.
>> ... '\)' gives me:
>> terminate called after throwing an instance of
>> 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error> >'
>> what(): Unmatched marking parenthesis ( or \(. The error occured
>> while parsing the regular expression fragment: '[0-9]*?);$>>>HERE>>>'.
>> 5) Even
>> boost::regex("^SIGNAL W07: STD_LOGIC_VECTOR\( 255 DOWNTO 0\);$");
>> fails to match
>> SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);
>>
>> Points 1a and 4 suggest that boost::regex treats even escaped
>> parentheses as special characters, which is quite surprising to me. I am
>> sure a lot of people have successfully used boost::regex to match parentheses,
>> so I do wonder which part of the documentation I have missed.
>>
>> I'd be happy to provide more debugging information if you tell me how to
>> produce it.
>>
>> Many thanks,
>> Simon Hoerder
>>
>
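> You need to escape the backslash character for normal C++ string-literal
> processing, i.e. use a double backslash. With a single backslash, g++ drops
> it (hence the "unknown escape sequence" warning) and the regex engine sees
> an unescaped parenthesis.
> It should be "\\(" and "\\)" to match a literal parenthesis.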
>
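
For reference, a minimal sketch of the fix described above. It uses std::regex from C++11 instead of boost::regex so that it compiles without Boost; this is a substitution, not what was used in the thread, and the assumption is that boost::regex, boost::smatch and boost::regex_match can be swapped in with the same pattern. The names kEscapedPattern, kRawPattern, SignalDecl and parse_signal are made up for illustration.

```cpp
#include <regex>
#include <string>

// Escaped form: every "\\" in the source is ONE backslash in the pattern,
// so the regex engine receives \( and \) and treats them as literals.
const std::string kEscapedPattern =
    "^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\\( ([0-9]*?) DOWNTO ([0-9]*?)\\);$";

// Raw string literal (C++11 and later): backslashes are taken verbatim,
// so the pattern can be written exactly as it would be in Perl.
const std::string kRawPattern =
    R"re(^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO ([0-9]*?)\);$)re";

// Hypothetical result type: the three captured groups, kept as strings.
struct SignalDecl {
    std::string name;
    std::string high;
    std::string low;
};

// Hypothetical helper: returns true and fills 'out' if 'line' is a
// declaration of the form "SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);".
bool parse_signal(const std::string& line, SignalDecl& out) {
    static const std::regex re(kEscapedPattern);
    std::smatch m;
    if (!std::regex_match(line, m, re)) {
        return false;
    }
    out.name = m[1].str();
    out.high = m[2].str();
    out.low = m[3].str();
    return true;
}
```

Both constants hold the same characters, so either spelling can be passed to the regex constructor. The raw form needs a compiler with C++11 support, which the g++ from the original post may not have had.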

Works now, thanks. :-)

Cheers, Simon

- --
/***
  * Dipl. Ing. Simon Hoerder
  * Work: | Private:
  * Department of Computer Science | First Floor Flat
  * Merchant Venturers Building, 2.01 | 7 Whatley Road
  * Woodland Road |
  * Bristol, BS8 1UB | Bristol, BS8 2PS
  * United Kingdom | United Kingdom
  *
  * http://www.cs.bris.ac.uk/Research/CryptographySecurity/
  * UK mobile: +44 7564 035925
  * DE mobile: +49 179 7906117
  * Skype: aloisius_hingerl
  ***/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iFYEARELAAYFAk5SkdgACgkQE8ykjYCSVs5M+gDcDVbVT2kga464e0CVrlONzZSA
sHjcLVW5pSBKdgDgmXnvlJGq9AGJxuR4RTZi5CY0JMblmRgYA24xiw==
=WdbJ
-----END PGP SIGNATURE-----

