Boost logo

Boost :

From: Moore, Paul (Paul.Moore_at_[hidden])
Date: 1999-09-06 03:48:23


From: Greg Colvin [mailto:gcolvin_at_[hidden]]

> Both NFA and DFA approaches have advantages. As I recall,
> and it's been a while, a DFA uses time linear in the amount
> of text to search, but potentially exponential space to store
> the DFA, whereas the NFA uses space linear in the size of
> the search pattern, but potentially exponential time. I've
> used NFAs for small query parsers, and DFAs for searching
> across big collections of text.
>

My view is that the significant issue here is one of design. There are lots
of regular expression packages out there (most for C, one or two for C++),
with varying syntax (Posix, Perl, Grep, Egrep, ...) and implementation
details (DFA, NFA, hybrid, ...). For me, the key issue has been finding
something with a natural C++ syntax, which fits cleanly with the standard
library.

John's implementation is the first I have seen which makes a serious attempt
at that.

I've not looked too closely at the details yet, but it would be good if the
traits approach could be extended to allow for "drop in" alternative regular
expression syntax and/or implementations. For a Java regex package which has
this feature (by means of compiler and matcher "interfaces", and multiple
concrete implementations), see
http://www.quateams.com/oro/software/OROMatcher1.1.html.

> I very much welcome John's participation -- what is most
> important is a clean syntax -- the semantics can be fixed
> to be whatever we need them to be.

To my mind, regular expressions an example of something which is

a) Substantial
b) Commonly available
c) Relatively easy to keep portable (as opposed to say GUI classes)

and which would benefit hugely from a boost-sponsored reference
implementation.

If we can produce a de-facto standard implementation, that's one less wheel
for people to reinvent. [That's the motivation for my idea of plug-in syntax
and matchers - those areas are the two places where "religious" differences
rear their ugly heads, so allowing people to make their own versions within
the framework seems like a good idea - a bit like the STL (lack of) hash
maps...]

But overall, I also welcome John's participation.

Paul.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk