Subject: Re: [Boost-users] [Spirit.Qi] failing to parse IP address octets
From: Michael Powell (mwpowellhtx_at_[hidden])
Date: 2017-11-27 23:03:00


Just taking a 20k meter view of things, this is looking to me like a
bug in one or more of the following, or some Skipper dynamic I'm
unaware of:

* optional (-)
* alternative (|)

It may also be several of these interacting with each other.

As far as I can tell, the grammar itself looks fine given the Spirit Qi
operators I know about.

Can anyone at all look into this?

Thank you...

On Mon, Nov 27, 2017 at 5:48 PM, Michael Powell <mwpowellhtx_at_[hidden]> wrote:
> On Mon, Nov 27, 2017 at 5:04 PM, Michael Powell <mwpowellhtx_at_[hidden]> wrote:
>> On Mon, Nov 27, 2017 at 4:39 PM, Michael Powell <mwpowellhtx_at_[hidden]> wrote:
>>> Hello,
>>>
>>> I've got the following parsers which are failing to find the correct attributes.
>>>
>>> I tried to parse in terms of uint_parser<uint8_t, 10, 1, 3>() at
>>> first, but this was not working properly.
>>>
>>> So I tried the following:
>>>
>>> _octet_rule = lexeme[
>>> char_('0')
>>> | (char_('1') >> -(digit >> -digit))
>>> | (char_('2')
>>> >> -((char_('0', '4') >> -(digit))
>>> | (char_('5') >> -(char_('0', '5')))
>>> | char_('6', '9')
>>> )
>>> )
>>> | (char_('3', '9') >> -digit)
>>> ];
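For reference, the rule above is meant to accept exactly the decimal strings 0 through 255 with no leading zeros. A minimal sketch of an independent check for that, in plain C++ with no Spirit involved, could serve as an oracle in the tests. The name is_octet is hypothetical and only illustrates the intended language of the rule:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Spirit: returns true when `s` is a
// decimal octet in 0..255 written without leading zeros, which is the
// set of strings the lexeme rule is intended to match in full.
bool is_octet(const std::string& s) {
    if (s.empty() || s.size() > 3) return false;      // one to three characters
    for (char c : s)
        if (c < '0' || c > '9') return false;         // digits only
    if (s.size() > 1 && s[0] == '0') return false;    // "0" is the only form starting with '0'
    return std::stoi(s) <= 255;                       // at most three digits, so no overflow
}
```

Comparing is_octet(text) against the outcome of the Qi parse for every candidate string is one way to tell whether a failure comes from the grammar or from attribute handling.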
>
> And if I "flatten" the alternatives, the parse succeeds, but the
> synthesized attribute looks really odd, which is a problem of its
> own:
>
> lexeme[
> char_('0')
> | (char_('1') >> -(digit >> -digit))
> | (char_('3', '9') >> -digit)
> | (char_('2') >> char_('5') >> -char_('0', '5'))
> | (char_('2') >> char_('0', '4') >> -digit)
> | (char_('2') >> char_('6', '9'))
> //| (char_('2')
> // >> -(
> // (char_('0', '4') >> -digit)
> // | (char_('5') > -char_('0', '5'))
> // | char_('6', '9')
> // )
> // )
> ];
>
> The Address rule itself has not changed and still incorporates this
> Octet rule. The test failures are now:
>
> FAILED:
> CHECK_THAT( y, Equals(x) )
> with expansion:
> "84.2244.31.76" equals: "84.244.31.76"
> ^^^^^^^
> with message:
> Verifying address >>> 84.244.31.76 <<<
>
> FAILED:
> CHECK_THAT( y, Equals(x) )
> with expansion:
> "251.170.2248.119" equals: "251.170.248.119"
> ^^^^^^^
> with message:
> Verifying address >>> 251.170.248.119 <<<
>
> FAILED:
> CHECK_THAT( y, Equals(x) )
> with expansion:
> "87.119.108.224" equals: "87.119.108.24"
> ^^^^^^^
> with message:
> Verifying address >>> 87.119.108.24 <<<
>
> FAILED:
> CHECK_THAT( y, Equals(x) )
> with expansion:
> "160.3.2230.143" equals: "160.3.230.143"
> ^^^^^^^
> with message:
> Verifying address >>> 160.3.230.143 <<<
>
> What am I missing here? I suspect I need to invoke something like
> hold[] or lexeme[], beyond what the Octet pattern itself is already
> doing?
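One explanation that fits the doubled '2' in these failures: the Qi documentation for hold[] notes that alternative parsers normally do not create a fresh attribute instance for each branch, so characters appended by a branch that later fails can remain in a container attribute. The sketch below is a plain C++ model of that behaviour, not Spirit code. Step, Sequence, match_sequence, match_alternatives and parse_two_branches are all hypothetical names, and the `hold` flag only imitates what hold[] is documented to do (work on a copy, commit on success).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One element of a sequence: an inclusive character range, optionally skippable.
struct Step { char lo; char hi; bool optional; };
using Sequence = std::vector<Step>;

// Try one sequence at `pos`, appending matched characters to `attr`.
// On failure the input position is left unchanged, but anything already
// appended to `attr` stays there.
bool match_sequence(const Sequence& seq, const std::string& in,
                    std::size_t& pos, std::string& attr) {
    std::size_t p = pos;
    for (const Step& s : seq) {
        if (p < in.size() && in[p] >= s.lo && in[p] <= s.hi) {
            attr += in[p++];
        } else if (!s.optional) {
            return false;
        }
    }
    pos = p;
    return true;
}

// Try the alternatives in order. With `hold` set, each branch works on a
// scratch copy of the attribute that is committed only on success.
bool match_alternatives(const std::vector<Sequence>& alts, const std::string& in,
                        std::size_t& pos, std::string& attr, bool hold) {
    for (const Sequence& seq : alts) {
        if (hold) {
            std::string scratch = attr;
            if (match_sequence(seq, in, pos, scratch)) { attr = scratch; return true; }
        } else if (match_sequence(seq, in, pos, attr)) {
            return true;
        }
    }
    return false;
}

// The three '2'-prefixed branches of the flattened rule, in the same order.
std::string parse_two_branches(const std::string& in, bool hold) {
    const std::vector<Sequence> alts = {
        {{'2', '2', false}, {'5', '5', false}, {'0', '5', true}},
        {{'2', '2', false}, {'0', '4', false}, {'0', '9', true}},
        {{'2', '2', false}, {'6', '9', false}},
    };
    std::size_t pos = 0;
    std::string attr;
    match_alternatives(alts, in, pos, attr, hold);
    return attr;
}
```

In this model, parse_two_branches("244", false) yields "2244" and parse_two_branches("24", false) yields "224", the same shapes as the failures above, while passing true for `hold` yields "244" and "24". If the real grammar behaves the same way, wrapping each multi-character branch in hold[] would be the thing to try.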
>
>>> This rule is presently failing to capture all three digits in the
>>> "20\d"-"24\d" cases.
>>>
>>> Octet itself parses correctly when I use this rule:
>>>
>>> _octet_rule >> !(char_);
>>>
>>> My octet test cases verify the entire range of 0-255 successfully, as
>>> well as the following invalid cases:
>>>
>>> VerifyOctet(-1, invalid_);
>>>
>>> VerifyOctet(1990, invalid_);
>>> VerifyOctet(2550, invalid_);
>>> VerifyOctet("255.", invalid_);
>>> VerifyOctet(256, invalid_);
>>> VerifyOctet(999, invalid_);
>>>
>>> VerifyOctet("abc", invalid_);
>>> VerifyOctet(123.456, invalid_);
>>>
>>> Converted to corresponding string, i.e. "1990" or "123.456".
>>>
>>> But the tests fail when I combine that into the IP address rule:
>>>
>>> _octet_rule >> repeat(3)[char_('.') >> _octet_rule];
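The shape of the address rule (one octet, then exactly three repetitions of a dot followed by an octet) can also be written out by hand as a cross-check. This is a sketch in plain C++ under the same assumptions as the octet rule (0..255, no leading zeros). parse_ipv4 is a hypothetical name and is not part of Spirit:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference parser: octet ('.' octet){3}, consuming the
// whole input. On failure `out` may be partially written.
bool parse_ipv4(const std::string& s, std::array<int, 4>& out) {
    std::size_t pos = 0;
    for (int i = 0; i < 4; ++i) {
        if (i > 0) {                                    // separator before octets 2..4
            if (pos >= s.size() || s[pos] != '.') return false;
            ++pos;
        }
        const std::size_t begin = pos;
        int value = 0;
        while (pos < s.size() && s[pos] >= '0' && s[pos] <= '9' && pos - begin < 3) {
            value = value * 10 + (s[pos] - '0');        // read at most three digits
            ++pos;
        }
        const std::size_t len = pos - begin;
        if (len == 0) return false;                     // no digits at all
        if (len > 1 && s[begin] == '0') return false;   // leading zero
        if (value > 255) return false;                  // out of range
        out[i] = value;
    }
    return pos == s.size();                             // nothing may follow the fourth octet
}
```

Feeding it "84.244.31.76" fills the array with 84, 244, 31, 76, while a mangled string such as "84.2244.31.76" is rejected, so it can flag a bad attribute independently of the Qi grammar.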
>>>
>>> And the test results:
>>>
>>> CHECK_THAT( y, Equals(x) )
>>> with expansion:
>>> "145.23.183.47" equals: "145.231.183.47"
>>> with message:
>>> Verifying address >>> 145.231.183.47 <<<
>>>
>>> FAILED:
>>> CHECK_THAT( y, Equals(x) )
>>> with expansion:
>>> "189.21.29.11" equals: "189.211.29.11"
>>> with message:
>>> Verifying address >>> 189.211.29.11 <<<
>>>
>>> FAILED:
>>> CHECK_THAT( y, Equals(x) )
>>> with expansion:
>>> "90.60.32.24" equals: "90.60.32.246"
>>> with message:
>>> Verifying address >>> 90.60.32.246 <<<
>>>
>>> FAILED:
>>> CHECK_THAT( y, Equals(x) )
>>> with expansion:
>>> "22.59.144.140" equals: "223.59.144.140"
>>> with message:
>>> Verifying address >>> 223.59.144.140 <<<
>>
>> Perhaps I should show more test cases as well, because when a lone
>> '2' is the entire octet, a space shows up in the result, which is
>> starting to look like a "skipper" issue to me.
>>
>> FAILED:
>> CHECK_THAT( y, Equals(x) )
>> with expansion:
>> "22.23.2 .152" equals: "221.236.2.152"
>> ^^^^
>> with message:
>> Verifying address >>> 221.236.2.152 <<<
>>
>>> IP address test data is randomly generated, or combinatorially if I
>>> wanted to wait for all 4.3B or so test cases. However, I figure that
>>> as long as the Octet rule is working, it is reasonable to expect the
>>> Address rule may also work.
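For context, the full combinatorial set is 256^4 = 4,294,967,296 addresses, hence the random sampling. A minimal sketch of how such test data could be generated reproducibly follows. random_ipv4 is a hypothetical helper, and the choice of std::mt19937 with a fixed seed is an assumption, not necessarily what the actual test suite does:

```cpp
#include <cassert>
#include <random>
#include <string>

// Hypothetical generator: four uniformly distributed octets joined by dots.
// Passing in a seeded engine makes any failing address reproducible.
std::string random_ipv4(std::mt19937& rng) {
    std::uniform_int_distribution<int> octet(0, 255);
    std::string s;
    for (int i = 0; i < 4; ++i) {
        if (i > 0) s += '.';
        s += std::to_string(octet(rng));
    }
    return s;
}
```

Two engines constructed with the same seed produce the same sequence of addresses, so a failure seen once can be replayed by logging the seed.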
>>>
>>> Not sure why those last digits are being dropped, however. Or how the
>>> 1's can succeed where the 2's are failing.
>>>
>>> Any suggestions, Spirit folks?
>>>
>>> Thanks!
>>>
>>> Cheers,
>>>
>>> Michael Powell

