Boost Users :

From: John Maddock (john_maddock_at_[hidden])
Date: 2002-04-18 06:24:14


> Speaking as a satisfied user of the regular expression library, always
> looking to help make it better:
>
> I was under the impression that point (a) didn't cost anything in
> Boost::Regex because it was templatized on the character type. Am I
> mistaken? Case (b) is fairly rarely used, but (c) is common. In any
> event, it is certainly true that after compiling the regular
> expression, you know whether these are needed. So if there are faster
> algorithms for these special cases, could they be incorporated into the
> library without much overhead?

The point is that a wide range of differing state machine representations
is available. To make "automatic" use of them, one would have to implement
several different regex state machines and switch between them based on
run-time detection of what kind of expression you have. That is a lot of
work, and it adds code bloat. With respect to (a), it is true that narrow
character regexes make some optimisations now, but many more are
available, mainly in combination with (b) and (c).
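
To make the "switch between them based on run time detection" idea concrete,
here is a minimal sketch of what such a dispatch could look like. It is not
how Boost.Regex works internally: the class name dispatching_matcher and its
members are made up for illustration, std::regex stands in for the general
engine, and the only "special case" detected is a pattern with no
metacharacters at all, which can be handed to a plain substring search.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper, not part of Boost.Regex or the standard library.
// It classifies the pattern once, at construction, and then routes every
// search to either a plain substring search or the general regex engine.
class dispatching_matcher {
public:
    explicit dispatching_matcher(const std::string& pattern)
        : literal_(is_plain_literal(pattern)), pattern_(pattern) {
        // Only pay for regex compilation when the general engine is needed.
        if (!literal_) compiled_ = std::regex(pattern);
    }

    // Returns true if the pattern occurs anywhere in text.
    bool search(const std::string& text) const {
        if (literal_) return text.find(pattern_) != std::string::npos;
        return std::regex_search(text, compiled_);
    }

    // Exposed so callers (and tests) can see which path was chosen.
    bool uses_fast_path() const { return literal_; }

private:
    // A pattern is treated as a literal when it contains none of the
    // ECMAScript metacharacters. This is deliberately conservative.
    static bool is_plain_literal(const std::string& p) {
        return p.find_first_of("\\^$.|?*+()[]{}") == std::string::npos;
    }

    bool literal_;
    std::string pattern_;
    std::regex compiled_;
};
```

Even this toy version carries two code paths that both have to be tested and
maintained, which is the cost described above; a real implementation would
need one such path per state machine representation.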

> > C based libraries can also use alloca, which generally gives at least a
> > 2x performance increase.
>
> I know that alloca is not 'officially' available in portable C++. But I
> think most C++ compilers will handle C-like usages for this construct.
> I know we use it successfully on the compilers we use (gcc, Sun CC). So
> if there is someplace it would be useful, you could almost certainly
> get away with it, probably #ifdef'd around for safety.

Point taken, however it means a complete rewrite, and it adds a lot to the
maintenance burden (more config options to test, etc.).
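
For reference, the "#ifdef'd around for safety" approach could look roughly
like the sketch below. It is only an illustration: the SKETCH_ macros, the
kMaxStackBytes limit and the balanced() function are hypothetical names, and
the bracket check merely stands in for a matcher that needs a scratch stack
whose size is known only at run time. On GCC-compatible compilers it uses the
__builtin_alloca builtin; everywhere else, and for large inputs, it falls back
to the heap. Note that the alloca call has to sit directly in the function
that uses the memory, since the memory is released when that function returns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Config switch: only use alloca where we know it is available.
#if defined(__GNUC__)
#  define SKETCH_HAS_ALLOCA 1
#  define SKETCH_ALLOCA(n) __builtin_alloca(n)
#else
#  define SKETCH_HAS_ALLOCA 0
#endif

// Assumed limit: larger scratch buffers go to the heap to avoid
// overflowing the stack.
const std::size_t kMaxStackBytes = 1024;

// Returns true if every '(' and '[' in text is closed in the right order.
bool balanced(const std::string& text) {
    const std::size_t n = text.size();
    if (n == 0) return true;

    char* stack = 0;          // scratch stack of currently open brackets
    std::vector<char> heap;   // fallback storage, unused on the fast path
#if SKETCH_HAS_ALLOCA
    if (n <= kMaxStackBytes)
        stack = static_cast<char*>(SKETCH_ALLOCA(n));
#endif
    if (stack == 0) {
        heap.resize(n);
        stack = &heap[0];
    }

    std::size_t depth = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const char c = text[i];
        if (c == '(' || c == '[') {
            assert(depth < n);  // at most one push per input character
            stack[depth++] = c;
        } else if (c == ')' || c == ']') {
            if (depth == 0) return false;
            const char open = stack[--depth];
            if ((c == ')' && open != '(') || (c == ']' && open != '['))
                return false;
        }
    }
    return depth == 0;
}
```

The two storage paths are exactly the extra configurations mentioned above:
each one has to be built and tested separately.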

Personally I would rather see a separate regex type with limited usefulness,
but better performance when it can be used.

John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/index.htm


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net