From: Christopher Kohlhoff (chris_at_[hidden])
Date: 2005-09-15 01:43:03


Hi Eric,

--- Eric Niebler <eric_at_[hidden]> wrote:
> What you are describing, at least in regex terms, is a partial match.
> Ordinarily, a regex match will give you a "yes" or a "no" answer.
> With a partial match, you can get a "maybe" if the input sequence is
> exhausted before the regex state machine has reached its final state.

I was aware of partial matches from perusing the documentation a while
back, and I'm not sure that it's exactly the same thing -- please
correct me if I'm wrong.

From my reading of the documentation, if I get a partial match and I
want to continue to try for a full match, I must buffer all of the data
from the beginning of the partial match. This means that the
partial_regex_grep example cannot find a substring match that is
longer than 4096 characters, because that is all the data it will
buffer.

Furthermore, each time I want to retest the input against the
expression, the whole buffered input must be processed again. This is
not ideal from an efficiency point of view, since I could potentially
receive input data one byte at a time.
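
To put a rough number on that cost, here is a minimal sketch. It uses
std::regex purely as a stand-in (it is not Boost.Regex and has no
partial-match mode), and the names RescanResult and
rescan_on_each_chunk are hypothetical. It appends each chunk to a
growing buffer, reruns the search over the whole buffer, and counts how
many characters were handed to the matcher in total.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: whether a match was found, and how many
// characters were handed to the matcher across all attempts.
struct RescanResult {
    bool matched;
    std::size_t chars_scanned;
};

// Hypothetical helper: simulates input arriving `chunk` bytes at a time.
// After every arrival the whole accumulated buffer is searched again.
RescanResult rescan_on_each_chunk(const std::string& input,
                                  std::size_t chunk,
                                  const std::regex& re) {
    RescanResult result = {false, 0};
    if (chunk == 0) {
        return result;  // nothing can ever arrive
    }
    std::string buffer;
    for (std::size_t pos = 0; pos < input.size(); pos += chunk) {
        buffer.append(input, pos, chunk);       // new data arrives
        result.chars_scanned += buffer.size();  // whole buffer is retested
        if (std::regex_search(buffer, re)) {
            result.matched = true;
            return result;
        }
    }
    return result;
}
```

If the match only completes on the last of n bytes and the data arrives
one byte at a time, the matcher is handed 1 + 2 + ... + n = n(n+1)/2
characters in total, compared with n when everything arrives in one
block.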

What I want is a stateful regular expression-based decoder object
(since in theory it's just a state machine and can remember its current
state). I can feed it more input which will cause more state
transitions, and it will tell me when it reaches a terminal state. I
never have to buffer more input than the block just read because
earlier input will have been fully consumed by the decoder.

As I said, this is an area I am very interested in exploring further
when time permits (and not just in relation to regular expressions, but
also things like Boost.Serialization), but that definitely belongs in
its own thread.

Cheers,
Chris

