Subject: Re: [Boost-bugs] [Boost C++ Libraries] #1201: Regexify the syntax highlighter
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2007-08-26 09:31:38
#1201: Regexify the syntax highlighter
-------------------------------+--------------------------------------------
Reporter: djowel | Owner: mumiee
Type: Feature Requests | Status: assigned
Milestone: To Be Determined | Component: quickbook
Version: Boost 1.34.1 | Severity: Problem
Resolution: | Keywords: documentat ibd boost-doc regular expression lexer regex xpressive quickbook lexer syntax highlighting
-------------------------------+--------------------------------------------
Changes (by mumiee):
* keywords: => documentat ibd boost-doc regular expression lexer regex
xpressive quickbook lexer syntax highlighting
* owner: djowel => mumiee
* status: new => assigned
Comment:
Syntax idea so far:
[sourcemode <NAME_OF_MODE> <' '-separated_LIST_OF_MODES>
<rulename> [REGEX] <OPTIONAL_TEMPLATE_TO_INVOKE_WITH_MATCH>
]
The first line defines the start rule. Every occurrence of a
rulename inside the regular expressions will be treated as a
reference to the regex attached to that rulename. Hence
recursion is possible.
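A minimal sketch of the rule-reference idea, assuming plain textual substitution: each rulename found inside a regex is replaced by the referenced regex, wrapped in a non-capturing group. The helper name expand_rules and the depth limit are made up for illustration. Substitution like this only covers non-recursive rule sets; real recursion needs support from the regex engine itself.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replaces every identifier in `pattern` that names a
// rule with that rule's regex, wrapped in a non-capturing group.
// Only non-recursive rule sets are handled; `depth` guards against cycles.
std::string expand_rules(const std::string& pattern,
                         const std::map<std::string, std::string>& rules,
                         int depth = 0)
{
    assert(depth < 32 && "rule set is recursive or nested too deeply");
    std::string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        unsigned char c = static_cast<unsigned char>(pattern[i]);
        if (c == '\\' && i + 1 < pattern.size()) {
            // Copy escapes such as \w or \n verbatim so that they are never
            // mistaken for rule names.
            out += pattern.substr(i, 2);
            i += 2;
        } else if (std::isalpha(c) || c == '_') {
            // Collect a whole identifier and look it up in the rule table.
            std::size_t j = i;
            while (j < pattern.size() &&
                   (std::isalnum(static_cast<unsigned char>(pattern[j])) ||
                    pattern[j] == '_'))
                ++j;
            const std::string word = pattern.substr(i, j - i);
            std::map<std::string, std::string>::const_iterator it = rules.find(word);
            if (it != rules.end())
                out += "(?:" + expand_rules(it->second, rules, depth + 1) + ")";
            else
                out += word;  // not a rule name: keep as literal text
            i = j;
        } else {
            out += pattern[i];
            ++i;
        }
    }
    return out;
}
```

Note that this exposes an ambiguity in the idea: a literal word inside a regex that happens to equal a rulename (for example a keyword alternative spelled char next to a rule named char) would be expanded as a reference.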
Because [] are very common regex characters, we might switch to:
<rulename> "REGEX" <TEMPLATE_TO_INVOKE_WITH_MATCH>
We will probably use xpressive, because it already allows recursion
and has a parser for strings. We would prefer spirit, if there was a
"dynamic" spirit, since ebnfs with operator- and eps_p are easier to use
than lookahead or lookbehind assertions.
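For the quoted rule form (rulename, then "REGEX", then an optional template name), a line parser could be sketched as follows. This is only an illustration: the struct RuleLine, the function parse_rule_line, and the choice that \" stands for a literal quote while all other backslash escapes are passed through to the regex are assumptions, not something decided in this ticket.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical representation of one rule line.
struct RuleLine {
    std::string name;   // rulename
    std::string regex;  // text between the quotes, with \" unescaped
    std::string tmpl;   // optional template name, empty if absent
};

// Parses `<rulename> "REGEX" <TEMPLATE>`; returns false on malformed input.
bool parse_rule_line(const std::string& line, RuleLine& out)
{
    std::size_t i = 0;
    auto skip_ws = [&]() {
        while (i < line.size() &&
               std::isspace(static_cast<unsigned char>(line[i])))
            ++i;
    };
    auto read_word = [&]() -> std::string {
        std::size_t start = i;
        while (i < line.size() &&
               !std::isspace(static_cast<unsigned char>(line[i])))
            ++i;
        return line.substr(start, i - start);
    };

    RuleLine r;
    skip_ws();
    r.name = read_word();
    skip_ws();
    if (r.name.empty() || i >= line.size() || line[i] != '"')
        return false;
    ++i;  // skip the opening quote

    bool closed = false;
    while (i < line.size()) {
        char c = line[i++];
        if (c == '"') { closed = true; break; }
        if (c == '\\' && i < line.size()) {
            char next = line[i++];
            // Assumption: \" is a literal quote; every other escape
            // (\n, \w, \*, ...) is kept for the regex engine.
            if (next != '"') r.regex += '\\';
            r.regex += next;
        } else {
            r.regex += c;
        }
    }
    if (!closed) return false;  // unterminated regex string

    skip_ws();
    r.tmpl = read_word();  // stays empty when no template is given
    out = r;
    return true;
}
```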
The ' ' separated list of modes should allow reusing existing source
mode definitions. We might prefix rules of imported regex...
An untested and incomplete C++ grammar could look like this:
[sourcemode cpp
program
"(comment|preprocessor|keyword|identifier|special|string|char|number|.)*"
comment "(//[^\n]*|/\*.*?\*/)" add_comment_markup
preprocessor "#\s[^\n]*" add_preproc_markup
keyword "(auto|bool|char|...)(?!\w)" add_keyword_markup
keyword "(auto|and|and_eq|bool|char|...)(?!\w)" add_keyword_markup
special "[\~!%^&\*()+={\[}\]:;,<\.>?/\|\-]+" add_special_markup
string "[lL]?\"([^\"]|\")*?\"" add_string_markup
char "[lL]?'([^']?)'" add_char_markup
number .....
]
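To get a feel for how such rules would drive the highlighter, here is a rough sketch with only the comment and keyword rules, written against std::regex rather than xpressive so it needs nothing beyond the standard library. The function name highlight and the <c>/<k> markup are placeholders for the real templates. Two deviations from the grammar above: block comments use [\s\S]*? because "." does not match newlines in ECMAScript syntax, and keywords get a leading \b because this sketch searches for tokens instead of matching the whole program from the start rule.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical markup: comments become <c>...</c>, keywords <k>...</k>.
std::string highlight(const std::string& src)
{
    // Group 1: comment rule, group 2: (shortened) keyword rule.
    static const std::regex token(
        R"((//[^\n]*|/\*[\s\S]*?\*/)|\b(auto|bool|char)(?!\w))");

    std::string out;
    std::size_t last = 0;
    for (std::sregex_iterator it(src.begin(), src.end(), token), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position(0));
        out.append(src, last, pos - last);  // unhighlighted text in between
        if (m[1].matched)
            out += "<c>" + m.str(1) + "</c>";
        else
            out += "<k>" + m.str(2) + "</k>";
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    out.append(src, last, std::string::npos);  // trailing text
    return out;
}
```

Because the comment alternative consumes the whole comment, keywords inside a comment are not marked up a second time.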
Stuff to decide:
1) What if the regex defines marks, groups submatches and so on? Should
every submatch become a parameter to the template? Shall we then omit the
complete match from the parameter list? Or shall we always submit the
complete match first, then the first to nth submatch, as parameters?
2) Should we implement a kind of binder syntax like in boost.bind for the
various matches? That way we would add a kind of substitution-like
functionality.
rule "\(#\sdefine \)[^\n]*" [extendedn_preproc_markup _1.. macro contents
are secrets]
So "#define PI 3.14159.." would turn into a highlighted:
"#define macro contents are secrets"
Development currently takes place at:
http://svn.boost.org/svn/boost/branches/xpressive/nested_dynamic_regex/
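Coming back to the two open questions, one possible answer is sketched here: the complete match is always available as _0 and the submatches as _1 to _9, and a template is plain text with those placeholders substituted. The names bind_template and apply_rule are invented for this sketch, std::regex stands in for xpressive, and the rule is written with \s* (not \s as in the example rule) so that "#define" without a space after the hash matches.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical binder: replaces _0 (whole match) and _1.._9 (submatches)
// in `tmpl`; placeholders without a corresponding group expand to nothing.
std::string bind_template(const std::smatch& m, const std::string& tmpl)
{
    std::string out;
    for (std::size_t i = 0; i < tmpl.size(); ++i) {
        if (tmpl[i] == '_' && i + 1 < tmpl.size() &&
            std::isdigit(static_cast<unsigned char>(tmpl[i + 1]))) {
            const std::size_t n = static_cast<std::size_t>(tmpl[i + 1] - '0');
            if (n < m.size()) out += m.str(n);
            ++i;  // skip the digit as well
        } else {
            out += tmpl[i];
        }
    }
    return out;
}

// Applies one rule to `text`: if the whole text matches, the bound template
// is returned, otherwise the text is returned unchanged.
std::string apply_rule(const std::string& text, const std::regex& rule,
                       const std::string& tmpl)
{
    std::smatch m;
    if (!std::regex_match(text, m, rule)) return text;
    return bind_template(m, tmpl);
}
```

With the rule (#\s*define )[^\n]* and the template "_1macro contents are secrets", the input "#define PI 3.14159" comes out as "#define macro contents are secrets".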
--
Ticket URL: <http://svn.boost.org/trac/boost/ticket/1201#comment:1>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.