Subject: Re: [Boost-bugs] [Boost C++ Libraries] #1201: Regexify the syntax highlighter
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2007-08-26 09:37:27
#1201: Regexify the syntax highlighter
-------------------------------+--------------------------------------------
Reporter: djowel | Owner: mumiee
Type: Feature Requests | Status: assigned
Milestone: To Be Determined | Component: quickbook
Version: Boost 1.34.1 | Severity: Problem
Resolution: | Keywords: documentat ibd boost-doc regular expression lexer regex xpressive quickbook lexer syntax highlighting
-------------------------------+--------------------------------------------
Comment (by mumiee):
Because the above is so terribly formatted:[[BR]]
Syntax idea so far:[[BR]]
[sourcemode <NAME_OF_MODE> <' '-separated_LIST_OF_MODES>[[BR]]
<rulename> [REGEX] <OPTIONAL_TEMPLATE_TO_INVOKE_WITH_MATCH>[[BR]]
][[BR]]
[[BR]]
The first line defines the start rule. Every occurrence of a[[BR]]
rulename inside the regular expressions will be treated as a[[BR]]
reference to the regex attached to that rulename. Hence,[[BR]]
recursion is possible.[[BR]]
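A rough sketch of how rulename references could be resolved, not the actual implementation: std::regex stands in for xpressive here, and the names RuleTable, expand and expand_rule are made up for illustration. Because std::regex has no recursion, this version inlines references textually and rejects recursive rules instead of supporting them. It also treats any whole word that equals a rulename as a reference, which is naive (a rule named "n" would collide with "\n").

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <regex>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical rule table: rulename -> regex text that may mention other rulenames.
using RuleTable = std::map<std::string, std::string>;

// Replace every whole-word occurrence of a known rulename in `pattern`
// with that rule's expanded regex, wrapped in a non-capturing group.
std::string expand(const RuleTable& rules, const std::string& pattern,
                   std::set<std::string>& active) {
    static const std::regex word("[A-Za-z_][A-Za-z_0-9]*");
    std::string out;
    std::size_t last = 0;
    for (auto it = std::sregex_iterator(pattern.begin(), pattern.end(), word);
         it != std::sregex_iterator(); ++it) {
        const std::smatch& m = *it;
        const auto pos = static_cast<std::size_t>(m.position());
        out.append(pattern, last, pos - last);  // copy text before the word
        last = pos + static_cast<std::size_t>(m.length());
        const auto rule = rules.find(m.str());
        if (rule == rules.end()) {              // not a rulename: keep as-is
            out += m.str();
            continue;
        }
        if (!active.insert(rule->first).second)  // already being expanded
            throw std::runtime_error("recursive rule: " + rule->first);
        out += "(?:" + expand(rules, rule->second, active) + ")";
        active.erase(rule->first);
    }
    out.append(pattern, last, std::string::npos);  // copy the tail
    return out;
}

// Expand the rule called `name` into one self-contained regex string.
std::string expand_rule(const RuleTable& rules, const std::string& name) {
    assert(rules.count(name) == 1 && "unknown start rule");
    std::set<std::string> active{name};
    return expand(rules, rules.at(name), active);
}
```

With the rules `digit = [0-9]` and `number = digit+(\.digit+)?`, expanding `number` yields a single regex that matches "3.14". With xpressive the references would presumably stay references, so recursive rules would not need to be rejected.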
[[BR]]
Because [] are very common regex characters, we might switch to:[[BR]]
<rulename> "REGEX" <TEMPLATE_TO_INVOKE_WITH_MATCH>[[BR]]
[[BR]]
We will probably use xpressive, because it already allows recursion[[BR]]
and has a parser for strings. We would prefer spirit if there were a[[BR]]
"dynamic" spirit, since EBNFs with operator- and eps_p are easier to use[[BR]]
than lookahead or lookbehind assertions.[[BR]]
[[BR]]
The ' '-separated list of modes should allow reusing existing source[[BR]]
mode definitions. We might prefix rules of imported regex... [[BR]]
An untested and incomplete C++ grammar could look like this:[[BR]]
[sourcemode cpp [[BR]]
program "(comment|preprocessor|keyword|identifier|special|string|char|number|.)*"[[BR]]
comment "(//[NOT\n]*|/\*.*?\*/)" add_comment_markup[[BR]]
preprocessor "#\s*[NOT\n]*" add_preproc_markup[[BR]]
keyword "(auto|bool|char|...)(?!\w)" add_keyword_markup[[BR]]
keyword "(auto|and|and_eq|bool|char|...)(?!\w)" add_keyword_markup[[BR]]
special "[\~!%&\*()+={\[}\]:;,<\.>?/\|\-]+" add_special_markup[[BR]]
string "[lL]?\"([NOT\"]|\")*?\"" add_string_markup[[BR]]
char "[lL]?'([NOT']?)'" add_char_markup[[BR]]
number .....[[BR]]
][[BR]]
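To make the grammar above a bit more concrete, here is a small sketch of how such a rule list could drive a highlighter. It is an illustration under assumptions, not the planned implementation: std::regex replaces xpressive, [NOT\n] is read as [^\n], the markup is a made-up <rulename>...</rulename> wrapper instead of real templates, and the identifier rule (referenced by program but not defined above) is a guess. The keyword list is kept as short as in the grammar.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

struct Rule {
    std::string name;   // rulename from the sourcemode block
    std::regex  regex;  // regex attached to that rulename
};

// A few rules from the cpp sourcemode; earlier rules win over later ones.
std::vector<Rule> cpp_rules() {
    return {
        {"comment",      std::regex(R"(//[^\n]*|/\*[\s\S]*?\*/)")},
        {"preprocessor", std::regex(R"(#\s*[^\n]*)")},
        {"keyword",      std::regex(R"((auto|bool|char)(?!\w))")},
        {"identifier",   std::regex(R"([A-Za-z_]\w*)")},  // assumed definition
        {"string",       std::regex(R"re([lL]?"(\\.|[^"\\])*")re")},
    };
}

// Walk the source; at each position try the rules in order and wrap the
// first match in hypothetical markup. Unmatched characters pass through,
// which plays the role of the "." alternative in the program rule.
std::string highlight(const std::string& source, const std::vector<Rule>& rules) {
    std::string out;
    auto pos = source.cbegin();
    while (pos != source.cend()) {
        bool matched = false;
        for (const Rule& rule : rules) {
            std::smatch m;
            if (std::regex_search(pos, source.cend(), m, rule.regex,
                                  std::regex_constants::match_continuous)
                && m.length() > 0) {
                assert(m.position() == 0);  // match_continuous anchors at pos
                out += "<" + rule.name + ">" + m.str() + "</" + rule.name + ">";
                pos += m.length();
                matched = true;
                break;
            }
        }
        if (!matched) out += *pos++;
    }
    return out;
}
```

For example, `highlight("bool x; // hi", cpp_rules())` wraps `bool` as a keyword, `x` as an identifier and `// hi` as a comment, leaving the punctuation and spaces untouched. Trying identifier after keyword is what keeps a name like `mybool` from being split.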
[[BR]]
Stuff to decide:[[BR]]
1) What if the regex defines marks, groups submatches, and so on? Should[[BR]]
every submatch become a parameter to the template? Should we then omit the[[BR]]
complete match from the parameter list? Or should we always pass the[[BR]]
complete match first, followed by the first to nth submatch, as parameters?[[BR]]
2) Should we implement a kind of binder syntax, as in boost.bind, for the[[BR]]
various matches? That way we would add a kind of substitution-like[[BR]]
functionality.[[BR]]
rule "\(#\s*define \)[NOT\n]*" [extendedn_preproc_markup _1.. macro contents are secrets][[BR]][[BR]]
So "#define PI 3.14159.." would turn into a highlighted:[[BR]]
"#define macro contents are secrets"[[BR]][[BR]]
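Both questions can be tried out with a small sketch. This is only one possible reading, with std::regex in place of xpressive and the hypothetical helpers bind_matches and apply_rule: placeholders _1.._n stand for the submatches and _0 for the complete match, so the template itself decides whether the complete match is used at all. The meaning of the ".." after _1 in the example is not modelled, and plain (...) groups replace the \(...\) marks.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Replace each placeholder _N in `tmpl` with submatch N of `m`
// (_0 is the complete match). Unknown indices expand to nothing.
std::string bind_matches(const std::string& tmpl, const std::smatch& m) {
    assert(!m.empty());  // only call this after a successful match
    static const std::regex placeholder("_([0-9]+)");
    std::string out;
    std::size_t last = 0;
    for (auto it = std::sregex_iterator(tmpl.begin(), tmpl.end(), placeholder);
         it != std::sregex_iterator(); ++it) {
        const std::smatch& p = *it;
        const auto pos = static_cast<std::size_t>(p.position());
        out.append(tmpl, last, pos - last);  // literal text before _N
        const std::size_t index = std::stoul(p[1].str());
        if (index < m.size()) out += m[index].str();
        last = pos + static_cast<std::size_t>(p.length());
    }
    out.append(tmpl, last, std::string::npos);  // literal tail
    return out;
}

// Apply one rule: if `pattern` matches somewhere in `text`, return the
// template with its placeholders bound; otherwise return `text` unchanged.
std::string apply_rule(const std::string& text, const std::string& pattern,
                       const std::string& tmpl) {
    std::smatch m;
    if (!std::regex_search(text, m, std::regex(pattern))) return text;
    return bind_matches(tmpl, m);
}
```

Calling `apply_rule` on "#define PI 3.14" with the pattern `(#\s*define )[^\n]*` and the template `_1macro contents are secrets` gives "#define macro contents are secrets". A naive placeholder scan like this would also rewrite a literal "_1" in the template text, so a real syntax would need some form of escaping.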
Development currently takes place at:[[BR]]
http://svn.boost.org/svn/boost/branches/xpressive/nested_dynamic_regex/[[BR]]
--
Ticket URL: <http://svn.boost.org/trac/boost/ticket/1201#comment:2>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.