Re: [Boost-bugs] [Boost C++ Libraries] #4721: multiple capture groups with the same name break regex

From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2010-10-13 09:20:27


#4721: multiple capture groups with the same name break regex
-------------------------------------+--------------------------------------
  Reporter: robin.snyder@… | Owner: johnmaddock
      Type: Bugs | Status: new
 Milestone: To Be Determined | Component: regex
   Version: Boost 1.44.0 | Severity: Problem
Resolution: | Keywords: regex named capture group
-------------------------------------+--------------------------------------

Comment (by anonymous):

 I'm still testing the revised code, but I believe your examples work now.
 For the new one I get:


 {{{
 The following match was found for text 11spa 12345 67890
 $0 = "11spa 12345 67890"
 $1 = "11spa 12345 67890"
 $2 = "11"
 $3 = "11"
 $4 = "s"
 $5 = "s"
 $6 = "pa"
 $7 = "12345 67890"
 $8 = "12345 67890"
 $9 = "12345"
 $10 = "67890"
 $11 = ""
 $12 = ""
 $13 = ""
 $14 = ""
 $15 = ""
 $16 = ""
 $17 = ""
 MPAT01 = 11spa 12345 67890
 MPAT01.zone = 11
 MPAT01.band = s
 MPAT01.grid = pa
 MPAT01.easting = 12345
 MPAT01.northing = 67890

 The following match was found for text 11spa 1234 6789
 $0 = "11spa 1234 6789"
 $1 = "11spa 1234 6789"
 $2 = "11"
 $3 = "11"
 $4 = "s"
 $5 = "s"
 $6 = "pa"
 $7 = "1234 6789"
 $8 = ""
 $9 = ""
 $10 = ""
 $11 = "1234 6789"
 $12 = "1234"
 $13 = "6789"
 $14 = ""
 $15 = ""
 $16 = ""
 $17 = ""
 MPAT01 = 11spa 1234 6789
 MPAT01.zone = 11
 MPAT01.band = s
 MPAT01.grid = pa
 MPAT01.easting = 1234
 MPAT01.northing = 6789

 The following match was found for text 11spa 123 678
 $0 = "11spa 123 678"
 $1 = "11spa 123 678"
 $2 = "11"
 $3 = "11"
 $4 = "s"
 $5 = "s"
 $6 = "pa"
 $7 = "123 678"
 $8 = ""
 $9 = ""
 $10 = ""
 $11 = ""
 $12 = ""
 $13 = ""
 $14 = "123 678"
 $15 = "123"
 $16 = "678"
 $17 = ""
 MPAT01 = 11spa 123 678
 MPAT01.zone = 11
 MPAT01.band = s
 MPAT01.grid = pa
 MPAT01.easting = 123
 MPAT01.northing = 678
 }}}

 I believe that is what you were hoping for?

 Note that Perl and .NET number named sub-expressions differently. They
 also differ in how they treat multiple named sub-expressions with the same
 name: in .NET they are treated as the same named capture group, whereas in
 Perl they are separate groups (with different numbers) that happen to
 share a name, so $+{name} returns the leftmost capture group called "name"
 that matched. As long as only one of the identically named captures can
 match at a time, the two approaches give the same result, apart from the
 numbers assigned to the capture groups. However, if more than one capture
 with a given name can match at a time, then it is possible to tell the
 difference between them, for example:

 (?<A>a)(?<A>b) against "ab"

 will result in $+{A} being "a" for Perl and "b" for .NET (at least I think
 that's what .NET does!!).

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/4721#comment:4>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
