Subject: [Boost-bugs] [Boost C++ Libraries] #11205: Add support for perl (*VERB) directives
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2015-04-19 10:51:17


#11205: Add support for perl (*VERB) directives
------------------------------+-------------------------
 Reporter: johnmaddock | Owner: johnmaddock
     Type: Feature Requests | Status: new
Milestone: To Be Determined | Component: regex
  Version: Boost 1.58.0 | Severity: Problem
 Keywords: |
------------------------------+-------------------------
 I'd like to see support for Perl's (*SKIP) regex verb in Boost.

 Perl has a number of such verbs, but this one has an interesting and
 frequent use case: it lets you search for an expression only where it
 appears outside certain contexts.

 There is a page called "The best regex trick" with details about the
 process and a number of examples. I can't link to it as this is my first
 message (I attempted before but it was rejected). It explains how to do
 it with and without (*SKIP). I've seen several questions on Stack
 Overflow asking how to accomplish that task, so the need seems to come
 up often.

 Let's say for example that we want to find the string 'foo' as an
 identifier in C. This is a crude example of a Perl regex that does it (a
 real one might need to be more elaborate; in particular, backslashes for
 line continuation are not considered):

   (?x-s) (?# free spacing, dot doesn't match newline)
   (?://.*+ (?# eat single-line comment text)
     |/\*[\S\s]*?\*/ (?# eat multi-line comment text)
     |"(?:\\.|[^"\n])*+" (?# eat string text)
   )(*SKIP)(?!) (?# skip these)
   |\bfoo\b (?# match this)

 With (*SKIP) supported, regex_search would match that expression only
 where foo appears outside of a string or comment.

 Without (*SKIP), it can be done only by calling regex_search multiple
 times (or iterating over all matches), using an expression like this:

   (?-s)//.*+|/\*[\S\s]*?\*/|"(?:\\.|[^"\n])*+"|(\bfoo\b)

 and ignoring every match where group 1 wasn't matched. That's presumed
 to be slower, and certainly more inconvenient for the programmer.
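
 The workaround above can be sketched roughly as follows. This is only an
 illustration under stated assumptions, not code from the ticket: it uses
 std::regex rather than Boost.Regex so that it is self-contained, and
 because the ECMAScript grammar has neither possessive quantifiers nor
 (?-s), the pattern is adapted accordingly. The function name
 find_foo_outside_context is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the offsets of every `foo` identifier that
// lies outside comments and string literals in `src`.
std::vector<std::size_t> find_foo_outside_context(const std::string& src) {
    // Alternatives 1-3 consume comments and strings without capturing;
    // only the last alternative captures into group 1.
    // Assumption: adapted for std::regex (no possessive `*+`, no `(?-s)`).
    static const std::regex re(
        R"re(//[^\n]*|/\*[\s\S]*?\*/|"(?:\\.|[^"\\\n])*"|(\bfoo\b))re");

    std::vector<std::size_t> hits;
    for (std::sregex_iterator it(src.begin(), src.end(), re), end;
         it != end; ++it) {
        // Ignore every match where group 1 did not participate: those are
        // the comments and strings we only wanted to step over.
        if ((*it)[1].matched) {
            hits.push_back(static_cast<std::size_t>(it->position(1)));
        }
    }
    return hits;
}
```

 The caller has to visit and discard every comment and string match,
 which is exactly the bookkeeping that (*SKIP)(?!) would push down into
 the regex engine.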

 Support for this particular use case would be a great feature to have in
 the regex engine.

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/11205>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
