Boost Users :

From: John Maddock (john_at_[hidden])
Date: 2007-01-11 12:59:49


Detlef Meyer-Eltz wrote:
> Now you can imagine, that it is a shock for me, to discover, that I
> misinterpreted the leftmost longest rule in the manner I liked. I
> didn't stumble over this error, because the matching of two
> alternatives with the same length seems to be a rare case.

Nod.

> void add_symbol(const charT* p, symbol_type s);
>
> I don't know, whether there is a chance to write code for such an
> addition to a lexregex which is already compiled. Otherwise such a
> lexregex had to be compiled in an extra step before use. In this form
> I could make it on top of the existing regex class.

I think you would have to recompile the whole regex in order to add an
arbitrary extension to the expression.
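
A minimal sketch of the "recompile on every addition" approach, built on top of an existing regex class. It uses std::regex as a stand-in for Boost.Regex, and everything named here (symbol_lexer, match_at_start, no_symbol) is hypothetical rather than part of either library. It assumes the added patterns contain no capturing groups of their own, so that group i + 1 always belongs to the i-th symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper: none of these names come from Boost.Regex.
class symbol_lexer {
public:
    using symbol_type = int;
    static constexpr symbol_type no_symbol = -1;

    // Stores the new pattern and rebuilds the whole combined expression.
    // Assumption: `p` contains no capturing groups of its own.
    void add_symbol(const char* p, symbol_type s) {
        assert(p != nullptr);
        symbols_.emplace_back(p, s);
        recompile();
    }

    // Returns the symbol whose alternative matched at the very start of
    // `input`, or no_symbol if nothing matched there.
    symbol_type match_at_start(const std::string& input) const {
        std::smatch m;
        if (symbols_.empty() ||
            !std::regex_search(input, m, combined_,
                               std::regex_constants::match_continuous)) {
            return no_symbol;
        }
        // Capture group i + 1 corresponds to symbols_[i].
        for (std::size_t i = 0; i < symbols_.size(); ++i) {
            if (m[i + 1].matched) return symbols_[i].second;
        }
        return no_symbol;
    }

private:
    // Joins all stored patterns as "(p1)|(p2)|..." and compiles the result.
    void recompile() {
        std::string pattern;
        for (std::size_t i = 0; i < symbols_.size(); ++i) {
            if (i != 0) pattern += '|';
            pattern += '(' + symbols_[i].first + ')';
        }
        combined_.assign(pattern);
    }

    std::vector<std::pair<std::string, symbol_type>> symbols_;
    std::regex combined_;
};
```

Each call to add_symbol pays for a full recompilation, so in practice one would probably collect all symbols first and compile once before use.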

> In this context there are two other points I'm interested in:
>
> In my parser generator there is a preference for literal tokens
> already (they aren't treated as regular expressions but by a ternary
> tree), and I have a vague idea, that generally a token should be
> preferred the more, the more literally it is. In your documentation
> you mention some experimental non-member comparison operators. What is
> the idea behind these comparisons? Could they be used, to define
> preferences?

The comparison operators aren't used anywhere to determine matching: they
can be used by the user to compare the result to a specific string, for
example.
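
As an illustration of that kind of use, here is a small sketch that compares a sub-match with a fixed string after matching has already finished. It is written against std::regex, whose sub_match type provides non-member comparison operators of this kind; the helper name first_word_is and the word pattern are assumptions made for the example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns true when the first word found in `input`
// is exactly `keyword`. The comparison happens after matching and has no
// influence on which alternative the regex engine chose.
bool first_word_is(const std::string& input, const char* keyword) {
    assert(keyword != nullptr);
    static const std::regex word("([A-Za-z_]+)");
    std::smatch m;
    if (!std::regex_search(input, m, word)) return false;
    return m[1] == keyword;  // sub_match compared with a C string
}
```

A parser could use such a check to tell a literal keyword apart from a general identifier once a token has been matched, but that preference would have to be implemented by the caller.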

> I guess, that testing one token after the other would be much more
> expensive, than testing them together. All the more as there is a
> special feature of my parser generator not only to test for tokens at
> the actual location in the input as to look for the next location,
> where one of several tokens occur. Can you tell me something about
> these differences of costs?

It's likely to be more expensive, yes: "impossible" branches in the state
machine get eliminated quite quickly in the regex internals during matching,
whereas the "one expression at a time" approach necessarily tests all the
candidates. The difference would depend very much upon the particular
expressions though.
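
The two approaches to finding the next location where one of several tokens occurs can be sketched as follows. This again uses std::regex as a stand-in, and the function names are hypothetical. The sketch only shows the shape of each approach and that both report the same position; it measures nothing, and it assumes the patterns are valid ECMAScript-syntax expressions.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// One expression at a time: every pattern scans the input separately and
// the earliest match position wins.
std::size_t next_token_one_by_one(const std::string& input,
                                  const std::vector<std::string>& patterns) {
    std::size_t best = std::string::npos;
    for (const std::string& p : patterns) {
        std::smatch m;
        if (std::regex_search(input, m, std::regex(p))) {
            const std::size_t pos = static_cast<std::size_t>(m.position(0));
            if (pos < best) best = pos;
        }
    }
    return best;
}

// All patterns together: a single alternation scans the input once.
std::size_t next_token_combined(const std::string& input,
                                const std::vector<std::string>& patterns) {
    if (patterns.empty()) return std::string::npos;
    std::string all;
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        if (i != 0) all += '|';
        all += "(?:" + patterns[i] + ")";
    }
    std::smatch m;
    if (!std::regex_search(input, m, std::regex(all))) return std::string::npos;
    assert(m.position(0) >= 0);  // a successful match has a valid position
    return static_cast<std::size_t>(m.position(0));
}
```

Both functions compile their expressions on every call to keep the sketch short; a real comparison of costs would compile once up front and time only the searches.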

HTH, John.

