From: John Maddock (John_Maddock_at_[hidden])
Date: 1999-09-14 06:01:23


Proposal for modifying regex++ regular
expression library for compatibility with
boost.

1) Immediate changes:
     a) change #include <header> to #include
       <boost/header>
     b) change #include <jm/header> to
       #include <boost/re/header>
     c) reorganise directories to reflect (a)
       and (b).
     d) change header extensions to .hpp.
     e) change #defines from JM_MACRO to
       BOOSTRE_MACRO, and deprecate macros intended
       for older compilers and STL versions to
       reduce configuration overhead.
     f) Change namespace from jm to boost::re,
       import entry point classes and algorithms
       into namespace boost.
     g) Change expression compilation to
       support DFA-style searching for those
       expressions that are capable of it. This
       will enable the library to give worst-case
       performance guarantees in these cases, and
       hopefully address the reservations
       expressed on the mailing list.
     h) Adjust sub-expression matching to
       follow "leftmost longest". Currently the
       overall expression follows this rule, but
       the sub-expressions match only by "longest"
       and not "leftmost", which can lead to
       ambiguity in some cases.
     i) Tidy up documentation to meet boost
       guidelines.
2) Medium term changes:
     a) Rewrite and tidy up the "backend"
       traits class implementation - fix the
       support for std::locale to give equivalent
       performance to the global locale versions -
       and make that the default version if the
       platform can support it. Eliminate multiple
       build modes for differing localisation
       models and replace them with a separate
       traits class for each of the current modes.
     b) Document traits class once interface
       is stable.
     c) Deprecate some of the library's
       internals that already have boost or
       standard library equivalents widely
       available. Possibly surface and document
       other parts of the library's internals -
       for example the Knuth Morris Pratt code.
3) In parallel and/or longer term:
     a) Define a design idiom for generic
       "pattern" types: just as containers have
       certain basic requirements that all
       containers should satisfy, so "compiled
       patterns" should have certain requirements
       that all pattern matching classes should
       meet. Just as some containers may be
       interchangeable under some circumstances,
       the same should be true for
       pattern-matching classes. For example,
       literal string algorithms (KMP, Boyer-Moore,
       Shift-Or etc) should be more or less
       interchangeable with each other (cf vector
       and deque), but not with, say, a regular
       expression (cf vector and set).
     b) Make boost regular expressions conform
       to 3a.
     c) Add additional pattern matching
       primitives.
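
To make the idea in (3a) concrete, here is a minimal sketch and
not a proposed interface. It assumes a hypothetical requirement
that every pattern type provides find(first, last), returning the
first match as an iterator pair, or (last, last) if there is none.
The names naive_pattern, kmp_pattern, find and count_matches are
invented for illustration only and do not come from regex++.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::string::const_iterator text_iter;
typedef std::pair<text_iter, text_iter> match_range;

// Hypothetical "pattern" requirement: find(first, last) returns the
// first match as [begin, end), or (last, last) when nothing matches.

// Brute-force literal search.
class naive_pattern {
public:
    explicit naive_pattern(const std::string& s) : pat_(s) {}
    match_range find(text_iter first, text_iter last) const {
        const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(pat_.size());
        for (text_iter i = first; last - i >= n; ++i) {
            if (std::equal(pat_.begin(), pat_.end(), i))
                return match_range(i, i + n);
        }
        return match_range(last, last);
    }
private:
    std::string pat_;
};

// Knuth-Morris-Pratt literal search: same interface, different engine.
class kmp_pattern {
public:
    explicit kmp_pattern(const std::string& s) : pat_(s), fail_(s.size(), 0) {
        // fail_[i]: length of the longest proper prefix of pat_[0..i]
        // that is also a suffix of pat_[0..i].
        std::size_t k = 0;
        for (std::size_t i = 1; i < pat_.size(); ++i) {
            while (k > 0 && pat_[i] != pat_[k]) k = fail_[k - 1];
            if (pat_[i] == pat_[k]) ++k;
            fail_[i] = k;
        }
    }
    match_range find(text_iter first, text_iter last) const {
        if (pat_.empty()) return match_range(first, first);
        std::size_t k = 0;  // number of pattern characters matched so far
        for (text_iter i = first; i != last; ++i) {
            while (k > 0 && *i != pat_[k]) k = fail_[k - 1];
            if (*i == pat_[k]) ++k;
            if (k == pat_.size()) {
                text_iter end = i + 1;
                return match_range(
                    end - static_cast<std::ptrdiff_t>(pat_.size()), end);
            }
        }
        return match_range(last, last);
    }
private:
    std::string pat_;
    std::vector<std::size_t> fail_;
};

// Generic algorithm written only against the requirement above, so any
// conforming pattern type can be substituted. Counts overlapping matches.
template <class Pattern>
std::size_t count_matches(const Pattern& p, const std::string& text) {
    std::size_t count = 0;
    text_iter pos = text.begin();
    for (;;) {
        match_range m = p.find(pos, text.end());
        if (m.first == text.end()) break;  // no further match
        ++count;
        pos = m.first + 1;                 // resume just past the match start
    }
    return count;
}
```

With these definitions, count_matches(naive_pattern("aba"), text)
and count_matches(kmp_pattern("aba"), text) can be swapped freely,
which is the vector/deque kind of interchangeability described in
(3a). A regular expression type would need a richer result (for
example sub-expression positions), so it would satisfy only part of
such a requirement set.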

The list above may not be exhaustive (!),
but these are the main points that come to
mind. Ideally all of (1) and (2) should be
carried out at the same time, and in
practice there may be some overlap. The
separation above represents my view of
priorities, but may not be the only one.
(3a) also feeds into (1) and (2), so I may
try to produce an initial draft for member
comment sooner rather than later if there
is interest. Probably I will set a time
limit on (1) and try to produce at least
an "experimental" port in that time.
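
For reference, the namespace arrangement in (1f) could look roughly
like the sketch below. It is only an illustration: pattern_stub and
is_empty are hypothetical stand-ins for the library's entry point
classes and algorithms, not real regex++ names. The point is that
the implementation lives in boost::re while using-declarations make
the entry points visible directly in boost.

```cpp
#include <cassert>
#include <string>

namespace boost {
namespace re {

// Hypothetical stand-in for an entry point class.
class pattern_stub {
public:
    explicit pattern_stub(const std::string& s) : text_(s) {}
    const std::string& str() const { return text_; }
private:
    std::string text_;
};

// Hypothetical stand-in for an entry point algorithm.
inline bool is_empty(const pattern_stub& p) { return p.str().empty(); }

} // namespace re

// Import the entry points into namespace boost, as described in (1f).
using re::pattern_stub;
using re::is_empty;

} // namespace boost
```

User code could then write boost::pattern_stub and boost::is_empty
without mentioning boost::re, while the remaining internals stay in
the nested namespace.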

Anyway thanks for the feedback,

Best regards,

John.

