[Boost-bugs] [Boost C++ Libraries] #12076: A couple issues matching with unicode regular expressions (word delimiters, brackets)

Subject: [Boost-bugs] [Boost C++ Libraries] #12076: A couple issues matching with unicode regular expressions (word delimiters, brackets)
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2016-03-19 13:57:36


#12076: A couple issues matching with unicode regular expressions (word delimiters,
brackets)
------------------------------+-------------------------
 Reporter:  anonymous         |      Owner:  johnmaddock
     Type:  Bugs              |     Status:  new
Milestone:  To Be Determined  |  Component:  regex
  Version:  Boost 1.61.0      |   Severity:  Problem
 Keywords:                    |
------------------------------+-------------------------
 Hi,

 The [https://github.com/mawww/kakoune/ kakoune] code editor uses
 boost-regex to search through a file with a regular expression, and I've
 stumbled upon some issues that I think are related to how Boost handles
 Unicode code points.

 The syntax used is the Perl one.

 First, the `\b` word delimiter doesn't seem to work when Unicode
 characters are involved: some strings that should match do not, e.g.
 "abc” 123" with the pattern "”\b".

 Secondly, using the "." pattern on strings that contain Unicode seems to
 select bytes rather than entire code points, e.g. "”" with the pattern
 "." will select two bytes.

 Finally, using brackets around Unicode characters does not work, for
 example "[”“]". This issue is probably related to the one above.

 I have had a look at the documentation, namely the
 [http://www.boost.org/doc/libs/1_60_0/libs/regex/doc/html/boost_regex/unicode.html
 Unicode & boost.regex] /
 [http://www.boost.org/doc/libs/1_60_0/libs/regex/doc/html/boost_regex/syntax/character_classes/optional_char_class_names.html
 Character classes supported by Unicode regular expressions] pages, but
 I'm not sure if they are related to the issues above (please let me know
 if I missed something).

 Thanks.

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/12076>