[Boost-bugs] [Boost C++ Libraries] #12619: Boost.Regex partial_match fails (see also Ticket #11776 feature request)

From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2016-11-23 17:15:14


#12619: Boost.Regex partial_match fails (see also Ticket #11776 feature request)
------------------------------------------------+-------------------------
 Reporter: Dr. Robert van Engelen <engelen@…> | Owner: johnmaddock
     Type: Bugs | Status: new
Milestone: To Be Determined | Component: regex
  Version: Boost 1.61.0 | Severity: Problem
 Keywords: partial_match |
------------------------------------------------+-------------------------
 Boost.Regex is a great library that we use extensively. I am re-raising
 Ticket 11776 as a bug. The `partial_match` implementation is broken
 because regex repetitions (*, +) may behave lazily or greedily depending
 on the size of the input text buffer. This is very unfortunate, because
 `partial_match` provides the '''only''' possible mechanism to search
 streaming input text without buffering the entire text. Restricting the
 regex to simple forms that do not include repetitions (*, +) is not a
 viable workaround. There are use cases in which we must take interactive
 input (i.e. buffering one char at a time) or process large files in which
 the text matching the pattern may not fit in the currently allocated
 buffer. In those cases the matcher does not produce the longest match
 and, worse, we cannot tell whether the buffer must be enlarged to
 continue iterating to find the longest match.

 The correct `partial_match` algorithm should follow this rule: '''as long
 as backtracking on a repetition pattern in the regex is still possible
 given some partial input text, Boost.Regex should flag the result as a
 partial match instead of a full match.''' With this change, matching
 "`abc.*123`" may require the whole input, but in this case that is OK! We
 need this flexibility in the matcher when using a buffering approach.
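
 To make the requested behaviour concrete, here is a minimal sketch of a
 buffering loop built on the proposed rule. It is an illustration only: it
 uses `std::regex` rather than Boost.Regex, and the names `MatchState`,
 `StreamResult`, `classify` and `search_stream` are hypothetical. The
 classifier is hard-wired to "`abc.*123`" and simply assumes that `.*` can
 always be extended while more input may arrive (true only if the input
 has no line terminators). A real engine would decide this from its
 backtracking state.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical three-way result: the proposed rule needs "partial" as a
// state distinct from "full" and "no match".
enum class MatchState { None, Partial, Full };

struct StreamResult {
    MatchState state;
    std::string text;  // best match found so far (may be empty)
};

// Toy stand-in for the proposed rule, specific to abc.*123.
// Assumption: while more input may arrive, the repetition .* can still
// extend (or a match can still begin), so the result is only Partial.
StreamResult classify(const std::string& buffer, bool at_end_of_input) {
    static const std::regex pattern("abc.*123");
    std::smatch m;
    const bool found = std::regex_search(buffer, m, pattern);
    if (!at_end_of_input) {
        return {MatchState::Partial, found ? m.str() : std::string()};
    }
    if (found) {
        return {MatchState::Full, m.str()};
    }
    return {MatchState::None, std::string()};
}

// Grow the buffer chunk by chunk and stop as soon as the result is no
// longer Partial. For this pattern that means reading the whole input,
// which the ticket explicitly accepts.
StreamResult search_stream(const std::vector<std::string>& chunks) {
    std::string buffer;
    StreamResult result{MatchState::None, std::string()};
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        buffer += chunks[i];
        const bool last = (i + 1 == chunks.size());
        result = classify(buffer, last);
        if (result.state != MatchState::Partial) {
            break;
        }
    }
    return result;
}
```

 Feeding the chunks "abc 1", "23 xy", "z 123 end" through `search_stream`
 gives a full match on "abc 123 xyz 123", the longest match, even though
 the first two chunks alone already contain "abc 123".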

 Unfortunately, the workaround suggested by the Boost.Regex documentation,
 checking whether the pattern matched the input up to the buffer end (which
 indicates a partial match), does not always work.
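
 The following sketch shows one way the end-of-buffer check can miss a
 longer match. It is an assumption-laden illustration, not a reproduction
 of the Boost.Regex code path: it uses `std::regex` so that it compiles
 without Boost, and `BufferMatch` and `search_buffer` are hypothetical
 names. The point is only that a match ending before the buffer end can
 still grow once more input is appended.

```cpp
#include <regex>
#include <string>

// Outcome of searching a single buffer.
struct BufferMatch {
    bool found;        // did the pattern match anywhere in the buffer?
    std::string text;  // the matched text, empty if not found
    bool reaches_end;  // does the match end exactly at the buffer end?
};

// Search one buffer for abc.*123 and report whether the match touches
// the end of the buffer, which is the signal the workaround relies on.
BufferMatch search_buffer(const std::string& buffer) {
    static const std::regex pattern("abc.*123");
    std::smatch m;
    if (!std::regex_search(buffer, m, pattern)) {
        return {false, std::string(), false};
    }
    return {true, m.str(), m[0].second == buffer.end()};
}
```

 With the input "abc 123 xyz 123 end", searching only the first ten
 characters ("abc 123 xy") yields "abc 123" with `reaches_end` false, so
 the check treats it as a finished full match. Searching the whole input
 yields the longer "abc 123 xyz 123".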

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/12619>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
