

From: Hakuro (hakuroum_at_[hidden])
Date: 2007-11-14 17:12:20


Hello folks,
I found an issue in regex_search() with the "\b" and "\B" assertions on
Win32 and created a patch (attached). Please review it.

Issue: \b and \B do not match at several word boundaries.
e.g.
regPat = /\bis\b/;
regPat.exec("This\u00C0is\u00C0bad"); //Does not match
regPat.exec("This$B$"(Bis$B$"(Bbad"); // Does not match

Desc:
In the ECMA-262 regex spec (15.10.2.6), the "\b" and "\B" assertions treat
only the characters [0-9A-Za-z_] as word characters; every other character
should be treated as a non-word character.
The Boost.Regex implementation for Win32 uses GetStringTypeEx() with the
C1_ALPHA | C1_DIGIT flags to determine the character type.
However, that API does not distinguish [0-9A-Za-z_] from other linguistic
characters (e.g. European characters, Kanji); it only separates linguistic
characters from everything else, which does not meet the spec.
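
For illustration, here is a minimal sketch of the ASCII-only classification
that the spec describes. The names is_ecma_word_char and is_word_boundary are
hypothetical; they are not taken from the attached patch or from Boost.Regex,
and the real fix lives in isctype().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ECMA-262 style word-character test: only [0-9A-Za-z_] qualify.
// Hypothetical helper, not part of Boost.Regex.
constexpr bool is_ecma_word_char(wchar_t c)
{
    return (c >= L'0' && c <= L'9')
        || (c >= L'A' && c <= L'Z')
        || (c >= L'a' && c <= L'z')
        || c == L'_';
}

// True when position pos in s is a \b boundary, i.e. exactly one of the
// characters on either side of pos is a word character. Positions before
// the start and after the end count as non-word.
inline bool is_word_boundary(const std::wstring& s, std::size_t pos)
{
    assert(pos <= s.size());
    const bool before = pos > 0 && is_ecma_word_char(s[pos - 1]);
    const bool after = pos < s.size() && is_ecma_word_char(s[pos]);
    return before != after;
}
```

Under this rule U+00C0 is a non-word character, so in "This\u00C0is\u00C0bad"
there is a boundary on both sides of "is" and /\bis\b/ would be expected to
match.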

Patch: A patch for w32_regex_traits.hpp is attached. It modifies isctype() to
determine the character type without using the API.

Regards.
Hak Matsuda
Lead Dev.
CRI Middleware inc.
340 Brannan St #400, San Francisco, CA 94107

-- 
Thanks!
HAK



Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk