From: John Maddock (John_Maddock_at_[hidden])
Date: 1999-09-14 06:01:23
Proposal for modifying regex++ regular
expression library for compatibility with
1) Immediate changes:
a) change #include <header> to #include
b) change #include <jm/header> to
c) reorganise directories to reflect (a)
d) change header extensions to .hpp.
e) change #defines from JM_MACRO to
BOOSTRE_MACRO & deprecate macros intended
for older compilers and STL versions to
reduce configuration overhead.
f) Change namespace from jm to boost::re,
import entry point classes and algorithms
into namespace boost.
g) Change expression compilation to
support DFA-style searching for those
expressions that are capable of it. This
will enable the library to give worst-case
performance guarantees in these cases, and
hopefully address the reservations
expressed on the mailing list.
h) Adjust sub-expression matching to
follow "leftmost longest". Currently, while
the overall expression does follow this
rule, the sub-expressions match only by
"longest" and not "leftmost", which can
lead to ambiguity in some cases.
i) Tidy up documentation to meet boost
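
To illustrate the kind of worst-case guarantee that (1g) is after, here
is a minimal sketch of DFA-style searching. It is not regex++ code: the
expression ab*c, the function names and the hand-written state machine
are all assumptions made for the example. The point is only that each
input character is examined exactly once, so the search is linear in the
length of the text and never backtracks.

```cpp
#include <cassert>
#include <string>

// Hypothetical hand-built DFA answering "does the text contain a match
// of ab*c?". Not taken from regex++; for illustration only.
//   state 0: no partial match in progress
//   state 1: seen 'a' followed by zero or more 'b'
// Reading 'c' in state 1 accepts.
bool dfa_contains_ab_star_c(const std::string& text) {
    int state = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const char ch = text[i];
        if (state == 0) {
            if (ch == 'a') state = 1;      // start a partial match
        } else {                           // state == 1
            if (ch == 'c') return true;    // accept
            if (ch == 'a' || ch == 'b') {
                state = 1;                 // 'a' restarts, 'b' extends
            } else {
                state = 0;                 // partial match abandoned
            }
        }
    }
    return false;
}

// Small usage check for the sketch above.
void check_dfa_examples() {
    assert(dfa_contains_ab_star_c("xxabbbc"));
    assert(dfa_contains_ab_star_c("aac"));
    assert(!dfa_contains_ab_star_c("abxc"));
    assert(!dfa_contains_ab_star_c(""));
}
```

A backtracking matcher may revisit the same characters many times on
awkward input; the loop above cannot, which is what makes a guarantee
possible for expressions that can be compiled this way.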
2) Medium term changes:
a) Rewrite and tidy up the "backend"
traits class implementation - fix the
support for std::locale to give equivalent
performance to the global locale versions -
and make that the default version if the
platform can support it. Eliminate multiple
build modes for differing localisation
models and replace with a separate traits
class for each of the current modes.
b) Document traits class once interface
c) Deprecate some of the library's
internals, which already have boost or
standard library equivalents widely
available. Possibly surface and document
other parts of the library's internals -
for example the Knuth Morris Pratt code.
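
As a rough sketch of what "a separate traits class for each of the
current modes" in (2a) might look like: the class names, the
translate_nocase member and the equal_ignoring_case algorithm below are
hypothetical, not the existing regex++ interface. One traits class uses
the global C locale, the other holds a std::locale object, and the
algorithm is written once against either.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical traits class backed by the global C locale.
struct c_locale_traits {
    char translate_nocase(char c) const {
        return static_cast<char>(
            std::tolower(static_cast<unsigned char>(c)));
    }
};

// Hypothetical traits class backed by a std::locale object.
class cpp_locale_traits {
public:
    explicit cpp_locale_traits(
        const std::locale& loc = std::locale::classic())
        : loc_(loc) {}
    char translate_nocase(char c) const {
        return std::tolower(c, loc_);   // <locale> overload
    }
private:
    std::locale loc_;
};

// One algorithm, parameterised on the traits class rather than on a
// build mode.
template <class Traits>
bool equal_ignoring_case(const std::string& a, const std::string& b,
                         const Traits& t = Traits()) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (t.translate_nocase(a[i]) != t.translate_nocase(b[i]))
            return false;
    }
    return true;
}

// Small usage check for the sketch above.
void check_traits_examples() {
    assert(equal_ignoring_case<c_locale_traits>("Regex", "REGEX"));
    assert(equal_ignoring_case<cpp_locale_traits>("Regex", "REGEX"));
    assert(!equal_ignoring_case<c_locale_traits>("Regex", "Regexp"));
}
```

The choice of localisation model then becomes a template argument
instead of a separate build of the library.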
3) In parallel and/or longer term:
a) Define a design idiom for generic
"pattern" types, rather as containers have
certain basic requirements that all
containers should satisfy, so "compiled
patterns" should have certain requirements
that all pattern matching classes should
meet. Just as some containers may be
interchangeable under some circumstances,
the same should be true for pattern-
matching classes. For example literal
string algorithms (KMP, Boyer-Moore, Shift-Or,
etc.) should be more or less
interchangeable with each other (cf vector
and deque), but not with say a regular
expression (cf vector and set).
b) Make boost regular expressions conform
c) Add additional pattern matching
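
To make (3a) a little more concrete, here is a minimal sketch of two
interchangeable literal-string pattern classes. The shared requirement
assumed here (a const find(text, from) member returning an offset or
std::string::npos) and all of the names are hypothetical; they are one
possible shape for the idiom, not a proposal for the final interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pattern class: defers to std::string::find.
class simple_pattern {
public:
    explicit simple_pattern(const std::string& p) : pat_(p) {}
    std::size_t find(const std::string& text, std::size_t from = 0) const {
        return text.find(pat_, from);
    }
private:
    std::string pat_;
};

// Hypothetical pattern class: Knuth-Morris-Pratt with a failure table.
class kmp_pattern {
public:
    explicit kmp_pattern(const std::string& p)
        : pat_(p), fail_(p.size(), 0) {
        for (std::size_t i = 1, k = 0; i < pat_.size(); ++i) {
            while (k > 0 && pat_[i] != pat_[k]) k = fail_[k - 1];
            if (pat_[i] == pat_[k]) ++k;
            fail_[i] = k;
        }
    }
    std::size_t find(const std::string& text, std::size_t from = 0) const {
        if (pat_.empty())
            return from <= text.size() ? from : std::string::npos;
        std::size_t k = 0;  // number of pattern characters matched so far
        for (std::size_t i = from; i < text.size(); ++i) {
            while (k > 0 && text[i] != pat_[k]) k = fail_[k - 1];
            if (text[i] == pat_[k]) ++k;
            if (k == pat_.size()) return i + 1 - k;
        }
        return std::string::npos;
    }
private:
    std::string pat_;
    std::vector<std::size_t> fail_;
};

// Generic algorithm written against the shared requirement only.
// Counts matches, including overlapping ones.
template <class Pattern>
std::size_t count_matches(const Pattern& p, const std::string& text) {
    std::size_t n = 0;
    for (std::size_t pos = p.find(text, 0); pos != std::string::npos;
         pos = p.find(text, pos + 1)) {
        ++n;
    }
    return n;
}

// Small usage check: both classes give the same answers.
void check_pattern_examples() {
    assert(count_matches(simple_pattern("aba"), "abababa") == 3);
    assert(count_matches(kmp_pattern("aba"), "abababa") == 3);
    assert(count_matches(kmp_pattern("ss"), "mississippi") == 2);
}
```

count_matches neither knows nor cares which algorithm sits behind find,
much as a generic algorithm can be handed a vector or a deque. A regular
expression class would need a richer interface (sub-expression results,
for example), which is why it would not be a drop-in replacement here.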
The list above may not be exhaustive (!),
but these are the main points that come to
mind.
Ideally all of (1) and (2) should be
carried out at the same time, and in
practice there may be some overlap. The
separation above represents my view of
priorities, but may not be the only one.
(3a) also feeds into (1) and (2), so I may
try to produce an initial draft for member
comment sooner rather than later if there
is interest. Probably I will set a time
limit on (1) and try to produce at least
an "experimental" port in that time.
Anyway, thanks for the feedback,