Re: [Boost-bugs] [Boost C++ Libraries] #1201: Regexify the syntax highlighter

Subject: Re: [Boost-bugs] [Boost C++ Libraries] #1201: Regexify the syntax highlighter
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2007-08-26 09:31:38


#1201: Regexify the syntax highlighter
-------------------------------+--------------------------------------------
  Reporter: djowel | Owner: mumiee
      Type: Feature Requests | Status: assigned
 Milestone: To Be Determined | Component: quickbook
   Version: Boost 1.34.1 | Severity: Problem
Resolution: | Keywords: documentat ibd boost-doc regular expression lexer regex xpressive quickbook lexer syntax highlighting
-------------------------------+--------------------------------------------
Changes (by mumiee):

  * keywords: => documentat ibd boost-doc regular expression lexer regex
               xpressive quickbook lexer syntax highlighting
  * owner: djowel => mumiee
  * status: new => assigned

Comment:

 Syntax idea so far:
 [sourcemode <NAME_OF_MODE> <' '-separated_LIST_OF_MODES>
 <rulename> [REGEX] <OPTIONAL_TEMPLATE_TO_INVOKE_WITH_MATCH>
 ]

 The first line defines the start rule. Every occurrence of a
 rulename inside the regular expressions will be treated as a
 reference to the regex attached to that rulename. Hence
 recursion is possible.
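
 A minimal sketch of how rule names could be resolved, assuming plain
 textual inlining rather than xpressive's nested regexes. The helper name
 expand_rules and the depth limit are hypothetical. Because it only inlines
 text, this sketch rejects recursive rules; real recursion would need an
 engine that supports references between regexes.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper: replace every rule name found in `pattern` with the
// (recursively expanded) regex text attached to that name.
std::string expand_rules(const std::string& pattern,
                         const std::map<std::string, std::string>& rules,
                         int depth = 0)
{
    // Textual inlining cannot express recursion, so give up on deep nesting.
    if (depth > 32)
        throw std::runtime_error("recursive rule cannot be inlined");

    std::string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        unsigned char c = static_cast<unsigned char>(pattern[i]);
        if (c == '\\' && i + 1 < pattern.size()) {
            // Copy escapes such as \w or \n verbatim so they are not
            // mistaken for the start of a rule name.
            out += pattern.substr(i, 2);
            i += 2;
        } else if (std::isalpha(c) || c == '_') {
            // Collect a whole word and look it up as a rule name.
            std::size_t j = i;
            while (j < pattern.size() &&
                   (std::isalnum(static_cast<unsigned char>(pattern[j])) ||
                    pattern[j] == '_'))
                ++j;
            const std::string word = pattern.substr(i, j - i);
            auto it = rules.find(word);
            if (it != rules.end())
                out += "(?:" + expand_rules(it->second, rules, depth + 1) + ")";
            else
                out += word;  // not a rule name: keep as literal text
            i = j;
        } else {
            out += pattern[i++];
        }
    }
    return out;
}
```

 With the rules digit = "[0-9]" and number = "digit+", expanding "number|x"
 gives "(?:(?:[0-9])+)|x". Note that a literal word that happens to equal a
 rule name would also be expanded, so rule names and literals must not
 collide in this sketch.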

 Because [] are very common regex characters, we might switch to:
 <rulename> "REGEX" <TEMPLATE_TO_INVOKE_WITH_MATCH>

 We will probably use xpressive, because it already allows recursion
 and has a parser for strings. We would prefer spirit if there were a
 "dynamic" spirit, since EBNFs with operator- and eps_p are easier to use
 than lookahead or lookbehind assertions.

 The ' ' separated list of modes should allow reusing existing source
 mode definitions. We might prefix rules of imported regex...

 An untested and incomplete C++ grammar could look like this:
 [sourcemode cpp
 program
 "(comment|preprocessor|keyword|identifier|special|string|char|number|.)*"
 comment "(//[^\n]*|/\*.*?\*/)" add_comment_markup
 preprocessor "#\s[^\n]*" add_preproc_markup
 keyword "(auto|bool|char|...)(?!\w)" add_keyword_markup
 keyword "(auto|and|and_eq|bool|char|...)(?!\w)" add_keyword_markup
 special "[\~!%^&\*()+={\[}\]:;,<\.>?/\|\-]+" add_special_markup
 string "[lL]?\"([^\"]|\")*?\"" add_string_markup
 char "[lL]?'([^']?)'" add_char_markup
 number .....
 ]
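
 To illustrate how such a rule list might drive a highlighter, here is a
 rough sketch using std::regex instead of xpressive. Everything in it is an
 assumption: the Rule struct, the highlight and cpp_rules functions, the
 angle-bracket output markup, and the identifier pattern (the grammar above
 names an identifier rule but does not define it). Rules are tried in order
 at the current position; if none matches, one character is copied
 unchanged, which plays the role of the trailing "." in the program rule.

```cpp
#include <functional>
#include <regex>
#include <string>
#include <vector>

// One grammar line: rule name, regex, and the "template" invoked with the match.
struct Rule {
    std::string name;
    std::regex pattern;
    std::function<std::string(const std::string&)> markup;
};

// Hypothetical stand-ins for the add_*_markup templates named in the grammar.
inline std::string add_comment_markup(const std::string& s) { return "<comment>" + s + "</comment>"; }
inline std::string add_keyword_markup(const std::string& s) { return "<keyword>" + s + "</keyword>"; }
inline std::string no_markup(const std::string& s) { return s; }

// A small subset of the C++ grammar (assumed patterns, ECMAScript syntax).
inline std::vector<Rule> cpp_rules() {
    return {
        {"comment",    std::regex(R"(//[^\n]*|/\*[\s\S]*?\*/)"),  add_comment_markup},
        {"keyword",    std::regex(R"((?:auto|bool|char)(?!\w))"), add_keyword_markup},
        {"identifier", std::regex(R"([A-Za-z_]\w*)"),             no_markup},
    };
}

// Walk the source, applying the first rule that matches at the current position.
inline std::string highlight(const std::string& src, const std::vector<Rule>& rules) {
    std::string out;
    auto pos = src.cbegin();
    while (pos != src.cend()) {
        bool matched = false;
        for (const Rule& r : rules) {
            std::smatch m;
            // match_continuous anchors the match at `pos`.
            if (std::regex_search(pos, src.cend(), m, r.pattern,
                                  std::regex_constants::match_continuous) &&
                m.length(0) > 0) {
                out += r.markup(m.str(0));
                pos += m.length(0);
                matched = true;
                break;
            }
        }
        if (!matched)
            out += *pos++;  // fallback: copy one character as-is
    }
    return out;
}
```

 For example, highlight("bool b; // hi", cpp_rules()) returns
 "<keyword>bool</keyword> b; <comment>// hi</comment>". The identifier rule
 is what keeps "boolean" from being split into a keyword plus "ean".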

 Stuff to decide:
 1) What if the regex defines marks, groups submatches, and so on? Should
 every submatch become a parameter to the template? Should we then omit the
 complete match from the parameter list? Or should we always pass the
 complete match first, then the first to nth submatch, as parameters?
 2) Should we implement a kind of binder syntax like in boost.bind for the
 various matches? That way we would add substitution-like
 functionality.
 rule "\(#\sdefine \)[^\n]*" [extendedn_preproc_markup _1.. macro contents
 are secrets]

 So "#define PI 3.14126.." would turn into a highlighted:
 "#define macro contents are secrets"
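
 One possible reading of the binder idea, sketched with std::regex. The
 helper names bind_submatches and rewrite_define are hypothetical, as is
 the choice that _0 means the complete match and _1 to _9 mean the
 submatches; the ticket leaves exactly that open. The rule is written with
 ECMAScript-style groups and "\s*" instead of the "\(...\)" and "\s" form
 above so that "#define" without a space after "#" matches.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: replace _0.._9 in `tmpl` with the corresponding
// submatch of `m` (_0 is the complete match). Unknown indices expand to nothing.
std::string bind_submatches(const std::string& tmpl, const std::smatch& m) {
    std::string out;
    for (std::size_t i = 0; i < tmpl.size(); ++i) {
        if (tmpl[i] == '_' && i + 1 < tmpl.size() &&
            std::isdigit(static_cast<unsigned char>(tmpl[i + 1]))) {
            std::size_t n = static_cast<std::size_t>(tmpl[i + 1] - '0');
            if (n < m.size())
                out += m.str(n);
            ++i;  // skip the digit as well
        } else {
            out += tmpl[i];
        }
    }
    return out;
}

// Apply one rule plus binder template to a single line of source.
std::string rewrite_define(const std::string& line) {
    static const std::regex rule(R"((#\s*define )[^\n]*)");
    std::smatch m;
    if (!std::regex_match(line, m, rule))
        return line;  // rule does not apply: leave the line unchanged
    return bind_submatches("_1macro contents are secrets", m);
}
```

 Here rewrite_define("#define PI 3.14126") returns
 "#define macro contents are secrets", because _1 expands to the captured
 "#define " prefix and the rest of the match is dropped.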

 Development currently takes place at:
 http://svn.boost.org/svn/boost/branches/xpressive/nested_dynamic_regex/

--
Ticket URL: <http://svn.boost.org/trac/boost/ticket/1201#comment:1>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.

