
Boost Users :

Subject: [Boost-users] Problems matching parentheses in a string using boost::regex 1.42.0 on an amd64 debian system
From: Simon Hoerder (simon_at_[hidden])
Date: 2011-08-22 12:56:55


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA224

Hi,

I'm fairly new to the Boost libraries. I have searched the documentation
and googled for an answer to my problem but couldn't find anything.
Please bear with me.

I am using boost::regex (version 1.42.0 on a Debian amd64 machine) to
parse a VHDL-like file. To this end, I am trying to use the regular
expression
boost::regex("^SIGNAL ([A-Z0-9_]*?): STD_LOGIC_VECTOR\( ([0-9]*?) DOWNTO
([0-9]*?)\);$");
to match strings like:
SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);

My problem is that matching those parentheses in the string doesn't
work. I've tested the regexp under Perl and it works fine, but with
boost::regex and C++ it only works if I remove the parentheses from
both the string and the regular expression.

Issues I observed & tested:
1) '\(' ... '\)': no match, and g++ complains about an unknown escape
   sequence '\)'.
1a) '\(' ... ')': no match. (I would have expected an exception to be
    thrown, but there is none.)
2) '0x28' ... '0x29': no match. (0x28 = ASCII '(', 0x29 = ASCII ')')
3) '\Q(\E' ... '\Q)\E': no match, and g++ complains about an unknown
   escape sequence '\Q'.
4) If I try to match only one of the brackets and remove the other from
   the string & regex I get one of the following:
   '\(' ... gives me:
terminate called after throwing an instance of
'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error>
>'
  what(): Found a closing ) with no corresponding openening
parenthesis. The error occured while parsing the regular expression
fragment: '?([0-9]*?)>>>HERE>>>);$'.
   ... '\)' gives me:
terminate called after throwing an instance of
'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::regex_error>
>'
  what(): Unmatched marking parenthesis ( or \(. The error occured
while parsing the regular expression fragment: '[0-9]*?);$>>>HERE>>>'.
5) Even
   boost::regex("^SIGNAL W07: STD_LOGIC_VECTOR\( 255 DOWNTO 0\);$");
   fails to match
   SIGNAL W07: STD_LOGIC_VECTOR( 255 DOWNTO 0);

Points 1a and 4 suggest that boost::regex treats even escaped
parentheses as special characters, which is quite surprising to me. I am
sure a lot of people have successfully used boost::regex to match
parentheses, so I wonder which part of the documentation I have missed.

I'd be happy to provide more debugging information if you tell me how to
produce it.

Many thanks,
Simon Hoerder

- --
/***
  * Dipl. Ing. Simon Hoerder
  * Work: | Private:
  * Department of Computer Science | First Floor Flat
  * Merchant Venturers Building, 2.01 | 7 Whatley Road
  * Woodland Road |
  * Bristol, BS8 1UB | Bristol, BS8 2PS
  * United Kingdom | United Kingdom
  *
  * http://www.cs.bris.ac.uk/Research/CryptographySecurity/
  * UK mobile: +44 7564 035925
  * DE mobile: +49 179 7906117
  * Skype: aloisius_hingerl
  ***/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iFYEARELAAYFAk5SilEACgkQE8ykjYCSVs7wGQDgmE795i+lC/qNHlwYVQvlxZtm
SyiB7JreAuxQogDfUY0VLkfvNOcj9q41/43U5D8WxwljzL71+TkW2g==
=wN0G
-----END PGP SIGNATURE-----

