From: David Abrahams (dave_at_[hidden])
Date: 2002-10-24 11:38:40
"Andrei Alexandrescu" <andrewalex_at_[hidden]> writes:
> > Finally there is one problem here: because regexes are interpreted
> > strings, you pay for every feature you might use, not for those
> > that you actually do use. User feedback up until now has been
> > driving towards more features not less, on the other hand you
> > could probably put together a very basic toy-regex implementation
> > in about 10K, IMO its use would be very limited though.
> Not what I think. If you're like me, most of the time you have simple
> parsing needs. Heck, I've parsed my tax data with a couple of simple rules.
> True, many will have more complex needs. The point is that users should pay
> for what they need, not the whole thing at once. This could be achieved by
> compiling versions with limited feature sets, even though the regex is
> interpreted. As things stand right now, the two mammoth regexp libraries are
> a perfect motivator for regular programmers (heh, that's a pun) to roll
> their own.
Now wait a second. If you try to roll your own regular expression
interpreter, you're almost certainly going to either get it wrong or
make something much bigger than the implementations John and Eric have
been refining for years, or both. If you just need a "simpler parser",
then maybe you /should/ be using different tools. Anyone who has
studied regexps for 5 minutes knows they're not exactly "simple", and
are in fact inappropriate for lots of structural parsing jobs (even
such things as parsing simple arithmetic expressions). Regexps are
really aimed at a different kind of textual analysis, and provide a
great deal of power in the domain in which they're appropriate.
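To make the arithmetic-expression point concrete: nested parentheses need recursion (or an explicit stack), which a classic regular expression does not have, although some engines add recursion extensions. Below is a minimal sketch of the "different tool", a hand-written recursive-descent evaluator. The names ArithParser and evaluate are hypothetical and made up for illustration; nothing here comes from John's or Eric's libraries.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical illustration only. Grammar handled:
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := number | '(' expr ')'
class ArithParser {
public:
    explicit ArithParser(const std::string& text) : text_(text), pos_(0) {}

    // Parse the whole input; reject anything left over after the expression.
    double parse() {
        double value = expr();
        skip_spaces();
        if (pos_ != text_.size()) throw std::runtime_error("trailing input");
        return value;
    }

private:
    void skip_spaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
    }

    // Consume character c if it is next (after whitespace).
    bool accept(char c) {
        skip_spaces();
        if (pos_ < text_.size() && text_[pos_] == c) {
            ++pos_;
            return true;
        }
        return false;
    }

    double expr() {
        double value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    double term() {
        double value = factor();
        for (;;) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    double factor() {
        if (accept('(')) {
            double value = expr();  // recursion handles arbitrary nesting depth
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        skip_spaces();
        std::size_t start = pos_;
        while (pos_ < text_.size() &&
               (std::isdigit(static_cast<unsigned char>(text_[pos_])) ||
                text_[pos_] == '.')) {
            ++pos_;
        }
        if (start == pos_) throw std::runtime_error("expected number");
        return std::stod(text_.substr(start, pos_ - start));
    }

    std::string text_;  // owned copy, so temporaries are safe to pass in
    std::size_t pos_;
};

// Convenience wrapper (hypothetical name).
double evaluate(const std::string& text) {
    return ArithParser(text).parse();
}
```

Calling evaluate("2*(3+4)") walks expr, term and factor, and re-enters expr at the parenthesis. That re-entry is exactly the part a plain regexp has no way to express.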
However, there are lots of reasons to choose regular expressions
even when their power is more than you need:
  a. You may already be familiar with the use and meaning of regexps
     from the many other contexts where they appear.
  b. (Related) You may have regular expressions lying around from
     some other job which recognize the things you care about.
  c. Somebody already rolled this library for you and you don't care
     that much about the size.
In particular, if I were parsing my tax data, and if regexps were up to
the job, I'd pick the library first.
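As a rough illustration of "picking the library first", here is a sketch that pulls a label and an amount out of one line of text with an off-the-shelf regex engine. Everything specific in it is an assumption: the "label: amount" line format is invented (the thread never says what the tax data looked like), parse_entry is a hypothetical name, and std::regex from C++11 stands in for the libraries under discussion.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: split a line such as "Mortgage interest: 1250.5"
// into a label and an amount. Returns false if the line does not fit
// the assumed "label: amount" shape.
bool parse_entry(const std::string& line, std::string& label, double& amount) {
    // Assumed format: letters and spaces, a colon, then an optionally
    // signed decimal number. Surrounding whitespace is ignored.
    static const std::regex pattern(
        R"(^\s*([A-Za-z ]+?)\s*:\s*(-?\d+(?:\.\d+)?)\s*$)");

    std::smatch match;
    if (!std::regex_match(line, match, pattern)) {
        return false;
    }
    label = match[1].str();
    amount = std::stod(match[2].str());
    return true;
}
```

The whole job is one pattern and one call. A hand-rolled matcher would have to reproduce the optional sign, the optional fraction and the whitespace handling itself.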
-- 
David Abrahams
dave_at_[hidden] * http://www.boost-consulting.com
Building C/C++ Extensions for Python: Dec 9-11, Austin, TX
http://www.enthought.com/training/building_extensions.html