Boost logo

Boost :

From: Julien Blanc (julien.blanc_at_[hidden])
Date: 2020-09-19 10:01:25


Hi,

Here's my small review of boost json.

## Are you knowledgeable about the problem domain?

I've written a json library (for reference:
https://github.com/Julien-Blanc-tgcm/jbc-json ) which i used as
playground / experiments. I guess this makes me knowledgeable enough of
the domain.

## What is your evaluation of the design?

I think the library is well designed, although it has several
shortcomings. These shortcomings may not alter general usage of the
library (with main target seeming to be REST api over http), but render
it useless in specific cases.

For me, a json library consists of three independant parts and some
glue:
* a parser
* a type for representing values (usually a variant)
* a printer (serializer)

### The parser

In boost.json, the separation between the parser and seems neat enough
(ie, you can use the parser without the json::value type). However, it
resides in a <detail> header, which would indicate that it is not an
expected use case. I believe it should be.

The callbacks of the parser (push parser) are not documented. I'm a bit
puzzled by what the on_number_part does, and how it fits with the
on_int64/on_double callbacks, for example.

The parser is an incremental push parser. However, it seems to expect
to be feeded with partial json documents, but not consecutive json
documents.

I could not find a way to make :

    auto v = p.write(R"json({"foo":1}{"bar":2})json");

return 9.

I may have missed something, but this use case is IMHO important to
address. Since the only way to know that a json document has ended is
to parse it, the parser should provide an option to parse a stream
containing multiple consecutive json documents (ie, stopping without
error at the end of each document). A common use case for this is JSON
notifications received via raw sockets, so no envelope may be used to
isolate JSON documents.

### The value type

The value type is a variant, the kind that you would expect from a json
library. Most of the interface uses string_view, which is a very good
thing (the fact that boost.json uses another string type internally has
not been an issue in my tests, due to the usage of string_view).

It lacks visitation, though (IIRC there’s an open issue for this).

The fact that this value type has been highly optimized is the great
strength of this library.

#### objects

The object interface relies on the [] operator, which incurs the same
shortcomings as with std::map :

auto v = foo["bar"]; // will silently create bar if it does not exists

Typos in keys had always been a source of hard to find bugs, i wished
there was an option to disable operator[] and force the use of find.

The library has no support for object with duplicate keys, which also
are valid json. This, however, can be worked around by using the parser
callbacks and not the variant (but that makes it difficult to produce
such documents, cf later).

Whether the order of the keys is kept by the object structure or not
does not appear in the documentation. From my small testing, it seems
that it is the case, but the api does not offer any interface to add
elements at specific position, so i would assume it is not guaranted.
The json specification, AFAIR, does not forbid to rely on a specific
order of the keys. Some serialization libraries do expect the keys to
be in a specific order.

#### numbers

Having two number types can lead to unexpected results :

    auto v = boost::json::parse("{\"foo\":1,\"bar\":2.0}");
    std::cout << v.as_object()["bar"].as_double() << std::endl; // ok
    std::cout << " " << v.as_object()["foo"].as_double(); // asserts

It is unexpected, because several json libraries will never output
“2E0” or “2.0” for the value “2”, but a plain “2”. It means that
whenever you expect a double value, you also have to check for the
possibility that it has been parsed as an integer. The to_double
function should silently convert integers to their double equivalent
whenever it is possible.

* as has already been noted by others, the number precision is
hardcoded to IEE754 or 64bits integer. This is seldom an issue, since
it is the case with most implementations.

#### The printer

The printer is not usable without the value item. I would expect it to
be (producing json without resorting to an intermediate
representation).

Apart from this, it supports incremental printing, which is what is
expected.

It produces a strange representation for numbers, but it is conforming
to the JSON specification.

Because it is tied to the json::value, it cannot produce anything that
cannot be represented by a json::value. This is an issue that should be
addressed.

## What is your evaluation of the implementation?

I did not have a thorough look into the implementation details. From
what i have seen, the implementation looks fine. A parser is usually a
landmine for security exposures, and would need a very careful review
which i cannot do. IIRC the authors plan on having an independant audit
on the library, which is a good thing. There is also fuzzing and tests,
so we can expect that this point is correctly being taken care of.

The rest of the code looks fine, easy to navigate into.

## What is your evaluation of the documentation?

Very good. Easy to use, complete. There is some missing information,
but they all seems related to design decision that have been taken
implicitly (already discussed in design section).

An example on using the parser on standalone mode would be a very
welcome addition.

Another thing that the doc should state clearly : boost.json is **not**
a JSON (ECMA-404) library, but rather an I-JSON (RFC 7493) library with
some extensions.

## What is your evaluation of the potential usefulness of the library?

Very useful. While it does not address all use cases (and does not try
to), it addresses what is probable the *most common* case. As JSON is
very widespread, i expect this library to find a large audience,
because it combines ease of use and performance.

## Did you try to use the library? With what compiler? Did you have any
problems?

I tried to use it on a small toy application, on a debian stable
system.

The main target seems to be to use the library with boost. I ran into
several issues when trying to use the standalone version.

The first one was easy to address, and due to gcc 8.3 lacking
c++17 support. A patch was submitted, and merged quickly.

The second one was incompatibilites between standalone asio bringing
boost headers (because boost is installed on my machine) and json
redefining them (because of standalone mode), resulting in multiple
defenitions of throw_exception. I'm not sure yet if i should file this
to boost.json, asio library, or asio debian package, or if it is just a
missing define on my part. At that point, i switched to boost json,
which fixed the issues i had.

## How much effort did you put into your evaluation?

I'd say around 10 hours reading, testing and playing, including writing
of the review.

## Do you think the library should be accepted as a Boost library?

I think the library should be ACCEPTED.

There are shortcomings in the library, it does not address every use
case in the world, but it does a very good job at what it is designed
to do. It is easy to use, performant, and it addresses a very
widespread use case, so the pros clearly outweighs the cons.

Regards,

Julien


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk