From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2019-10-22 20:21:05


On Tue, Oct 22, 2019 at 12:55 PM Mateusz Loskot via Boost
<boost_at_[hidden]> wrote:
> I'd consider covering the thing with https://google.github.io/oss-fuzz/ instead.

My strategy for ensuring correctness is two-fold. First, as with
Beast, it will be reviewed by an external company (they will do the
fuzzing). Second, I write special tests so that I have confidence
everything works. The methodology is as follows:

* Create a set of representative test vectors (examples of valid and
invalid inputs).
  I have written my own inputs, and I have imported these tests:
  <https://github.com/nst/JSONTestSuite/tree/master/test_parsing>

Then, for each test vector:

* Parse the input as one string and verify the output

* Loop over every location at which the input can be split into two
pieces, parse the input as two separate pieces, and verify the
output. Code:

<https://github.com/vinniefalco/json/blob/cb348218345cfe2bea09d4a8ca8ea4c0f13f8bc3/test/basic_parser.cpp#L32>
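To make the split-point idea concrete, here is a minimal sketch in
C++17. It is not the code from the linked test: `int_list_parser`,
`parse_whole`, `parse_split`, and `check_all_splits` are hypothetical
names, and a toy grammar (comma-separated unsigned integers) stands in
for JSON. The sketch also tries the two degenerate splits where one
piece is empty, which is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical toy incremental parser (not the library's API): accepts a
// comma-separated list of unsigned decimal integers, fed in arbitrary chunks.
class int_list_parser
{
    std::vector<unsigned> out_;
    unsigned cur_ = 0;
    bool have_digit_ = false;
    bool failed_ = false;

public:
    // Consume one chunk; parser state carries over between calls.
    void write(std::string_view chunk)
    {
        for (char c : chunk)
        {
            if (failed_)
                return;
            if (c >= '0' && c <= '9')
            {
                cur_ = cur_ * 10 + static_cast<unsigned>(c - '0');
                have_digit_ = true;
            }
            else if (c == ',' && have_digit_)
            {
                out_.push_back(cur_);
                cur_ = 0;
                have_digit_ = false;
            }
            else
            {
                failed_ = true; // unexpected character or empty element
            }
        }
    }

    // Signal end of input; returns false if the input was invalid.
    bool finish()
    {
        if (failed_ || !have_digit_)
            return false;
        out_.push_back(cur_);
        have_digit_ = false;
        return true;
    }

    std::vector<unsigned> const& result() const { return out_; }
};

struct parse_result
{
    bool ok;
    std::vector<unsigned> values;
};

// Parse the whole input as a single chunk.
parse_result parse_whole(std::string_view s)
{
    int_list_parser p;
    p.write(s);
    bool const ok = p.finish();
    return {ok, p.result()};
}

// Parse the input as two chunks: [0, i) and [i, size).
parse_result parse_split(std::string_view s, std::size_t i)
{
    int_list_parser p;
    p.write(s.substr(0, i));
    p.write(s.substr(i));
    bool const ok = p.finish();
    return {ok, p.result()};
}

// True if every split point gives the same outcome as the one-shot parse.
bool check_all_splits(std::string_view s)
{
    parse_result const expected = parse_whole(s);
    for (std::size_t i = 0; i <= s.size(); ++i)
    {
        parse_result const got = parse_split(s, i);
        if (got.ok != expected.ok)
            return false;
        if (expected.ok && got.values != expected.values)
            return false;
    }
    return true;
}
```

Calling `check_all_splits("12,345,6")` parses the input once as a whole
and then once per offset from 0 to the input length, so a bug that only
shows up when a token straddles a chunk boundary has nowhere to hide.
Invalid vectors such as "12,,3" go through the same loop and must fail
at every split.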

Then, for each possible split point, also run these tests:

* Using a special allocator (`fail_storage`) which throws after N
calls to allocate, attempt to parse the input in a loop where N starts
out at 1 and is incremented on each allocation failure. The test
succeeds if the loop exits within a maximum number of iterations and
the output is verified correct. Code for this failing allocator is
here:
    <https://github.com/vinniefalco/json/blob/cb348218345cfe2bea09d4a8ca8ea4c0f13f8bc3/test/test.hpp#L70>

* Using a special parser (`fail_parser`) which returns an error after
N calls to the parser's SAX API, attempt to parse the input in a loop
where N starts out at 1 and is incremented on each failure. The test
succeeds if the loop exits within a maximum number of iterations and
the output is verified correct. Code for this failing parser is
here:

<https://github.com/vinniefalco/json/blob/cb348218345cfe2bea09d4a8ca8ea4c0f13f8bc3/test/test.hpp#L206>
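The fail-after-N loop can be sketched roughly as follows, in C++17.
This is not the code from the linked test.hpp: `fail_state`,
`failing_allocator`, `build`, and `retry_with_failures` are
hypothetical names, filling a `std::vector` stands in for parsing, and
the choice that the Nth call to allocate is the one that throws is an
assumption. The parser-side variant would presumably follow the same
loop, with a handler that returns an error on its Nth call instead of
throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Shared counter (hypothetical helper): the fail_at-th allocation throws.
struct fail_state
{
    std::size_t fail_at = 0; // 1-based index of the allocation that throws
    std::size_t calls = 0;   // allocations attempted in the current attempt
};

// Minimal allocator that forwards to std::allocator but injects a failure.
template<class T>
struct failing_allocator
{
    using value_type = T;

    fail_state* state;

    explicit failing_allocator(fail_state* s) noexcept : state(s) {}

    template<class U>
    failing_allocator(failing_allocator<U> const& other) noexcept
        : state(other.state) {}

    T* allocate(std::size_t n)
    {
        if (++state->calls == state->fail_at)
            throw std::bad_alloc();
        return std::allocator<T>().allocate(n);
    }

    void deallocate(T* p, std::size_t n) noexcept
    {
        std::allocator<T>().deallocate(p, n);
    }
};

template<class T, class U>
bool operator==(failing_allocator<T> const& a,
                failing_allocator<U> const& b) noexcept
{
    return a.state == b.state;
}

template<class T, class U>
bool operator!=(failing_allocator<T> const& a,
                failing_allocator<U> const& b) noexcept
{
    return !(a == b);
}

using int_vec = std::vector<int, failing_allocator<int>>;

// Stand-in for "parse the input": any push_back may need to allocate.
int_vec build(std::size_t count, fail_state* st)
{
    failing_allocator<int> alloc(st);
    int_vec v(alloc);
    for (std::size_t i = 0; i < count; ++i)
        v.push_back(static_cast<int>(i));
    return v;
}

struct outcome
{
    bool ok;               // finished within the bound with correct output
    std::size_t attempts;  // number of attempts made
};

// Retry with the failure point moved one allocation later each time.
outcome retry_with_failures(std::size_t count, std::size_t max_attempts)
{
    fail_state st;
    for (std::size_t n = 1; n <= max_attempts; ++n)
    {
        st.fail_at = n;
        st.calls = 0;
        try
        {
            int_vec const v = build(count, &st);
            // No failure was hit this time: verify the output.
            bool good = v.size() == count;
            for (std::size_t i = 0; good && i < count; ++i)
                good = v[i] == static_cast<int>(i);
            return {good, n};
        }
        catch (std::bad_alloc const&)
        {
            // Allocation n failed; partial state was released during
            // unwinding. Try again with the failure one call later.
        }
    }
    return {false, max_attempts}; // never completed within the bound
}
```

For example, `retry_with_failures(100, 1000)` fails once at each
allocation the vector performs and then succeeds on the first attempt
that gets past all of them. Run under a leak checker or sanitizer, each
failed attempt also confirms that unwinding from that particular
allocation releases everything it had acquired.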

These tests are run under valgrind, address sanitizer, undefined
behavior sanitizer, and code coverage. Then I look at the code
coverage to find uncovered or partially covered lines, and devise
individual tests to ensure that code is exercised. By now, there are
only a handful of such lines, if that.

With these techniques I achieve close to 100% code coverage and very
high confidence that every path through the parser is correct. After a
bunch of testing (which consists of telling users it is "ready" and
seeing what they report back) I submit it to the external code
auditing company to get a report. After fixing any issues they raise
in the report, my strategy changes: touch the code as little as
possible. If this code used an external dependency, and that upstream
code changed, then transitively it means my code changed - for this
reason I avoid using external code like Spirit (or regex) even if it
means I have to duplicate stuff.

Thanks

