From: Eric Niebler (eric_at_[hidden])
Date: 2005-09-18 21:05:03


Eric Niebler wrote:
> John Maddock wrote:
>
>>there is one complicated
>>expression that xpressive didn't compile, but which Boost.Regex and PCRE
>>did handle OK.
>
> I've tracked this down. xpressive is rejecting this regex because it is
> invalid according to the TR1 spec. It begins:
>
> const char* highlight_expression =
> "(^[ \t]*#(?:[^\\\\\\n]|"
> "\\\\[^\\n_[:punct:][:alnum:]]*[\\n[:punct:][:word:]])*)|"
> ----------------------------------------------^^^^^^^^
>
> The problem is [:word:]. "word" is not a valid char-class-name,
> according to the TR1 spec:
>

I've dug a bit deeper and found other problems with this regex.
First, it uses "\\<" and "\\>" as begin- and end-of-word assertions.
These are not recognized ECMAScript syntax, so xpressive treats them as
literals. Also, you seem to be assuming that "\\n" is treated as a
newline. Right now, xpressive recognizes "\\n" as a newline only if you
pass the "normalize" syntax_option_type flag, in which case it also
recognizes "\\a", "\\f", "\\r", "\\t" and "\\v". If you don't want to
bother with the "normalize" flag, use "\n" in the pattern instead of
"\\n".
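
To make the "\\n" versus "\n" distinction concrete, here is a minimal
sketch in plain C++ of what the regex compiler actually receives in each
case. It involves no regex library at all; the two function names are
made up for illustration.

```cpp
#include <string>

// Hypothetical helpers, for illustration only.

// "a\\nb" in source: the C++ compiler collapses "\\" to one backslash, so
// the regex compiler sees four characters: 'a', '\', 'n', 'b'. Whether
// "\n" then means "newline" is up to the regex grammar.
std::string escaped_newline_pattern() { return "a\\nb"; }

// "a\nb" in source: the C++ compiler itself produces the line feed, so the
// regex compiler sees three characters: 'a', LF, 'b'. No regex-level
// escape handling is involved.
std::string literal_newline_pattern() { return "a\nb"; }
```

With the second form, the pattern contains a real newline character no
matter which escapes the regex grammar recognizes.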

Thinking ...

In looking over the ECMAScript syntax, I'm thinking this may be a bug in
xpressive. I should probably do away with the "normalize" flag and
always recognize "\\n" as a newline literal.
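
For comparison, here is a small sketch of the behavior I'd expect from an
ECMAScript-style grammar. It uses std::regex (whose default grammar is
ECMAScript) purely as a stand-in; it is not xpressive and says nothing
about what xpressive does today. The function name is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `text` is exactly 'a', newline, 'b'.
// The pattern reaches the regex compiler as a, backslash, n, b; the
// ECMAScript grammar interprets the "\n" escape as a line feed.
bool matches_with_escaped_newline(const std::string& text)
{
    static const std::regex re("a\\nb");  // ECMAScript is the default grammar
    return std::regex_match(text, re);
}
```

Here no extra syntax option is needed for the escaped form to match a
real newline in the input.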

Not sure what to do about "\\<" and "\\>", though.
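
For what it's worth, ECMAScript's own word-boundary assertion is "\\b",
which covers many uses of "\\<" and "\\>" (though it does not distinguish
the start of a word from the end). A rough sketch, again using std::regex
as a stand-in rather than xpressive, with a hypothetical helper name and
the assumption that the word contains no regex metacharacters:

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts whole-word occurrences of `word` in `text`.
// Assumes `word` contains no regex metacharacters (it is not escaped here).
std::size_t count_whole_word(const std::string& text, const std::string& word)
{
    const std::regex re("\\b" + word + "\\b");  // \b = ECMAScript word boundary
    auto first = std::sregex_iterator(text.begin(), text.end(), re);
    auto last = std::sregex_iterator();
    return static_cast<std::size_t>(std::distance(first, last));
}
```

For example, count_whole_word("cat concat catalog cat", "cat") counts the
first and last words only, since "concat" and "catalog" have no boundary
on one side of "cat".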

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com
