Subject: Re: [Boost-bugs] [Boost C++ Libraries] #3899: Regex: Bug in handling of "\Z"
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2010-02-05 12:50:30
#3899: Regex: Bug in handling of "\Z"
-------------------------------------------------+--------------------------
Reporter: Keith MacDonald <keith@…> | Owner: johnmaddock
Type: Bugs | Status: new
Milestone: Boost 1.43.0 | Component: None
Version: Boost 1.42.0 | Severity: Problem
Keywords: |
-------------------------------------------------+--------------------------
Comment(by johnmaddock):
There are two separate issues here:
1) Boost.Regex has always treated all line-termination characters as
equivalent, so for example $ will match before any line-termination
sequence: \n, \r\n, \r, plus a few other Unicode-specific sequences. This is
different to Perl's behaviour, but then Perl has complete control over
file IO and text file formats and line endings, whereas Boost.Regex does
not - and is intended to work with all text file formats, wherever they're
from and however they're read in. This seems to have worked well in
practice up until now, and I don't really want to change it.
2) The behaviour of \Z in Perl seems to be quite "quirky" ;-) In fact
it's quite hard to write a regular expression that matches its behaviour
exactly! From messing around it seems to be:
$(?=\n\z)|\z
whereas Boost is doing:
$(?=\v+\z)|\z
This one I will look into changing, even though I would argue that the
current behaviour is often more useful :-)
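To make the difference between the two expressions concrete, here is a minimal hand-written model in standard C++. It is a sketch and does not call Boost.Regex or Perl. The function names are hypothetical, and the set of vertical whitespace is assumed to be just \n, \v, \f and \r, ignoring the Unicode-specific terminators mentioned in point 1. Each function returns the earliest offset at which \Z could match under the corresponding expression.

```cpp
#include <cstddef>
#include <string>

// Model of $(?=\n\z)|\z (the Perl-like rule described above):
// \Z matches at the very end of the input, or just before a single
// final "\n". Returns the earliest such offset.
std::size_t perl_big_z_offset(const std::string& s) {
    if (!s.empty() && s.back() == '\n') {
        return s.size() - 1;  // just before the one trailing newline
    }
    return s.size();          // only the very end qualifies
}

// Model of $(?=\v+\z)|\z (the rule attributed to Boost above):
// \Z matches at the very end, or before a trailing run of vertical
// whitespace. Assumption: only these four ASCII characters count.
std::size_t boost_big_z_offset(const std::string& s) {
    const std::string vertical = "\n\v\f\r";
    const std::size_t last = s.find_last_not_of(vertical);
    // npos means the input is empty or consists only of line terminators.
    return last == std::string::npos ? 0 : last + 1;
}
```

Under this model the two rules agree for "abc" and "abc\n" (both give offset 3) but differ for "abc\n\n", where the Perl-like rule gives 4 and the Boost-like rule gives 3, and for "abc\r\n", where they give 4 and 3 respectively.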
John.
--
Ticket URL: <https://svn.boost.org/trac/boost/ticket/3899#comment:3>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.