From: Vladimir Pozdyayev (ardatur_at_[hidden])
Date: 2004-11-01 02:21:57


John,
I've studied the interfaces you suggested, and here are my
observations. Please correct me if I'm wrong.

<perl_matcher> is a collection of algorithms that reach directly
into the underlying <basic_regex>'s internal data structures and
use them to perform matching and related operations.

<basic_regex_creator> is a "syntax features to internals"
converter which is called directly by the parser. In theory,
implementing it should be the better way to initialize customized
structures. However, the way it is used is somewhat tricky: it
fills in not its own data structures, but those of the class that
called the parser in the first place, <basic_regex_implementation>.
So that class would need reimplementing, too.

Now for the real problem. Both <perl_matcher> and
<basic_regex_creator> deal with the already compiled state machine
or its elements. The first works directly with the regex
internals; the second receives <append_state> and similar calls
from the parser. The trouble is that some of my algorithms'
calculations have to be performed directly on the expression tree,
and the compiled state machine won't help there. Is it possible to
reconstruct the tree from the information provided by the library?
That is, given the regex "((?:a|b)*?)(b+)", end up with an object
like

    new cat(
        new match(
            new kleene_lazy(
                new alt(
                    new charset( "a" ),
                    new charset( "b" )
                )
            )
        ),
        new match(
            new repeat( new charset( "b" ) )
        )
    )
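
To make the request concrete, here is a minimal sketch of the kind
of node classes I have in mind. Every name in it (node, wrap, cat,
match, kleene_lazy, alt, charset, repeat, make_example) is
hypothetical and my own; none of this is part of Boost.Regex. The
str() member only renders the tree back to regex text so the shape
can be checked, and charset does no escaping of special characters.

```cpp
#include <string>

// Hypothetical expression-tree nodes; not part of Boost.Regex.
class node {
public:
    node() {}
    virtual ~node() {}
    // Binding strength: 0 = alternation, 1 = concatenation,
    // 2 = postfix repeat, 3 = atom or capturing group.
    virtual int prec() const = 0;
    // Render the subtree back to regex syntax.
    virtual std::string str() const = 0;
private:
    node(const node&);             // non-copyable: nodes own their children
    node& operator=(const node&);
};

// Render a child, adding (?:...) when it binds more loosely than required.
inline std::string wrap(const node& n, int min_prec) {
    return n.prec() >= min_prec ? n.str() : "(?:" + n.str() + ")";
}

class charset : public node {
    std::string chars_;
public:
    explicit charset(const std::string& chars) : chars_(chars) {}
    int prec() const { return 3; }
    std::string str() const {
        return chars_.size() == 1 ? chars_ : "[" + chars_ + "]";
    }
};

class unary : public node {
protected:
    node* child_;
    explicit unary(node* c) : child_(c) {}
public:
    ~unary() { delete child_; }
};

class binary : public node {
protected:
    node* lhs_;
    node* rhs_;
    binary(node* l, node* r) : lhs_(l), rhs_(r) {}
public:
    ~binary() { delete lhs_; delete rhs_; }
};

// Capturing group: ( ... )
class match : public unary {
public:
    explicit match(node* c) : unary(c) {}
    int prec() const { return 3; }
    std::string str() const { return "(" + child_->str() + ")"; }
};

// Lazy zero-or-more: x*?
class kleene_lazy : public unary {
public:
    explicit kleene_lazy(node* c) : unary(c) {}
    int prec() const { return 2; }
    std::string str() const { return wrap(*child_, 3) + "*?"; }
};

// Greedy one-or-more: x+
class repeat : public unary {
public:
    explicit repeat(node* c) : unary(c) {}
    int prec() const { return 2; }
    std::string str() const { return wrap(*child_, 3) + "+"; }
};

// Alternation: x|y
class alt : public binary {
public:
    alt(node* l, node* r) : binary(l, r) {}
    int prec() const { return 0; }
    std::string str() const { return wrap(*lhs_, 0) + "|" + wrap(*rhs_, 0); }
};

// Concatenation: xy
class cat : public binary {
public:
    cat(node* l, node* r) : binary(l, r) {}
    int prec() const { return 1; }
    std::string str() const { return wrap(*lhs_, 1) + wrap(*rhs_, 1); }
};

// Builds the tree for "((?:a|b)*?)(b+)"; the caller deletes the result.
inline node* make_example() {
    return new cat(
        new match(new kleene_lazy(new alt(new charset("a"), new charset("b")))),
        new match(new repeat(new charset("b"))));
}
```

Calling make_example()->str() on this sketch should give back the
original pattern text. What I actually need is the reverse
direction: obtaining such a tree from what the library exposes.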

And now for something completely different.

The following program outputs ' aa', where the first char is \0.
If we replace <smatch> with <cmatch>, the output is correct. This
happens with regex5 as well as with the regex library in Boost
1.31.0. Am I missing something? (Compiler: MSVC 7.0.)

#include <boost/regex.hpp>
#include <iostream>
int main() {
    boost::smatch m;
    boost::regex_match( "aaa", m, boost::regex( ".*" ) );
    std::cout << m[ 0 ] << "\n";
}

-- 
Best regards,
Vladimir Pozdyayev.
