From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2020-09-23 11:55:35


On 23/09/2020 10:04, Hans Dembinski via Boost wrote:
>
>> On 22. Sep 2020, at 16:23, Peter Dimov via Boost <boost_at_[hidden]> wrote:
>>
>> But in the meantime, there are people who have actual need for a nlohmann/json (because of speed) or RapidJSON (because of interface) replacement, and we don't have it. It doesn't make much sense to me to wait until 2032 to have that in Boost.
>
> My rough count of accept votes indicates that Boost.JSON is going to be accepted, so you get what you want, but I feel we gave up on trying to achieve the best possible technical solution for this problem out of a wrong sense of urgency (also considering the emails by Bjørn and Vinícius it does not seem like we need to wait for 2032 for a different approach).

For the record, I've had offlist email discussions about the proposed
Boost.JSON with a number of people, where the general feeling was that
there was no point in submitting a review: negative review feedback
would be ignored, possibly with personal retribution thereafter, and the
library was always going to be accepted in any case. So basically a
review would be wasted effort, and they haven't bothered.

I haven't looked at the library myself, so I cannot say whether the
concerns those people raised are justified, but what you just stated
above about the lack of trying for the best possible technical solution
hits the nail on the head, if one were to attempt to summarise the
feeling of all that correspondence.

Me personally, if I were designing something like Boost.JSON, I'd
implement it using a generator-emitting design. I'd make the supply of
input destructive and gather-buffer based: you feed the parser
arbitrarily sized chunks of input, and the array of pointers to those
discontiguous input blocks is the input document. As the generator
resumes, emits and suspends during the parse, it would destructively
modify those input blocks in place, to avoid as much dynamic memory
allocation and memory copying as possible. I'd avoid all variant
storage and all type erasure by separating the input syntax lex from
the value parse (which would be on-demand, lazy); that also lets one
say "go get me the next five key-values in this dictionary", which
would utilise superscalar CPU concurrency to execute those in parallel.
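
To make that concrete, here is a minimal sketch of the shape such a
pull interface might take. Everything in it is hypothetical (it is not
Boost.JSON's design, nor any other library's); it handles only the
simple escapes and gives up on tokens which straddle a chunk boundary,
but it shows the destructive in-place compaction and the
resume/emit/suspend loop:

#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

enum class event { object_begin, object_end, array_begin, array_end,
                   string, literal, need_more };

struct chunk { char *data; std::size_t size; };

class pull_parser {
  std::vector<chunk> chunks_;        // the gather buffer of input blocks
  std::size_t c_ = 0, i_ = 0;        // current chunk and offset

  bool at_end() const { return c_ >= chunks_.size(); }
  char peek() const { return chunks_[c_].data[i_]; }
  void advance() { if (++i_ == chunks_[c_].size) { ++c_; i_ = 0; } }

public:
  // Feed another arbitrarily sized block; the array of blocks *is* the
  // input document.
  void feed(char *data, std::size_t size) { chunks_.push_back({data, size}); }

  // Resume, emit one event, suspend. Returns need_more when input runs out.
  event next(std::string_view &out) {
    while (!at_end() && std::isspace((unsigned char)peek())) advance();
    if (at_end()) return event::need_more;
    switch (peek()) {
    case '{': advance(); return event::object_begin;
    case '}': advance(); return event::object_end;
    case '[': advance(); return event::array_begin;
    case ']': advance(); return event::array_end;
    case ',': case ':': advance(); return next(out);
    case '"': return lex_string(out);
    default:  return lex_literal(out);   // numbers, true/false/null
    }
  }

private:
  // Unescape the string destructively in place, so `out` aliases the
  // caller's own bytes: no allocation, no copy. For brevity this sketch
  // returns need_more on a string that straddles two chunks.
  event lex_string(std::string_view &out) {
    chunk ck = chunks_[c_];
    std::size_t r = i_ + 1;                     // past the opening quote
    char *w = ck.data + r, *begin = w;
    while (r < ck.size && ck.data[r] != '"') {
      char ch = ck.data[r++];
      if (ch == '\\' && r < ck.size) {          // collapse simple escapes
        char esc = ck.data[r++];
        ch = esc == 'n' ? '\n' : esc == 't' ? '\t' : esc;
      }
      *w++ = ch;                                // destructive compaction
    }
    if (r == ck.size) return event::need_more;
    i_ = r + 1;                                 // past the closing quote
    if (i_ == ck.size) { ++c_; i_ = 0; }
    out = {begin, std::size_t(w - begin)};
    return event::string;
  }

  // Literals are emitted as raw text; the value parse happens lazily, on
  // demand, at the call site (no variant storage anywhere). A literal cut
  // off by a chunk boundary is mis-split in this simplified sketch.
  event lex_literal(std::string_view &out) {
    chunk ck = chunks_[c_];
    std::size_t start = i_;
    while (i_ < ck.size && !std::isspace((unsigned char)ck.data[i_])
           && ck.data[i_] != ',' && ck.data[i_] != '}' && ck.data[i_] != ']')
      ++i_;
    out = {ck.data + start, i_ - start};
    if (i_ == ck.size) { ++c_; i_ = 0; }        // roll to the next chunk
    return event::literal;
  }
};

The consumer drives the loop itself: call next() until it returns
need_more, feed another block, resume. String values remain zero-copy
views into the consumer's own input blocks throughout.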

I would also attempt to make the whole JSON parser constexpr, not
necessarily because we need to parse JSON at compile time, but because
it would force the right kind of design decisions (e.g. all literal
types), which would generate significant added value for the C++
ecosystem. I mean, what's the point of yet another N+1 JSON parser when
we could have an all-constexpr JSON parser in the style of Hana
Dusíková's compile-time regex?
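
As a toy illustration of the property being argued for (a structural
validity check only, nowhere near a full parser): because the function
below is built purely from literal types, the identical code executes
under static_assert at compile time and as an ordinary function at run
time.

#include <cstddef>
#include <string_view>

constexpr bool balanced(std::string_view json) {
  int depth = 0;
  bool in_string = false;
  for (std::size_t i = 0; i < json.size(); ++i) {
    char c = json[i];
    if (in_string) {
      if (c == '\\') ++i;                 // skip the escaped character
      else if (c == '"') in_string = false;
    }
    else if (c == '"') in_string = true;
    else if (c == '{' || c == '[') ++depth;
    else if (c == '}' || c == ']') --depth;
    if (depth < 0) return false;
  }
  return depth == 0 && !in_string;
}

// The design constraint pays off here: the checks run during compilation.
static_assert( balanced(R"({"k":[1,2,"x\"y"]})"));
static_assert(!balanced(R"({"k":[1,2})"));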

Consider this: a Hana Dusíková-style all-constexpr JSON parser could
let you tell the compiler, at compile time, "this is the exact
structure of the JSON that shall be parsed". The compiler then bangs
out optimum parse code for that specific JSON structure. At runtime,
the parser tries the pregenerated canned parsers first; if none match,
it falls back to runtime parsing. Given that much JSON is just a long
sequence of identically structured records, this would be a very
compelling new C++ JSON parsing library, a whole new and better way of
doing parsing. *That* I would get excited about.
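
Here is a sketch of the dispatch half of that idea only, under loud
assumptions: the canned fast path is hand-written below where a
Dusíková-style library would generate it from a compile-time structure
specification, and the record shape and every name are hypothetical.

#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>

struct log_record { int id; double value; };   // hypothetical known shape

// Stands in for what the compiler would generate for exactly
// {"id":<int>,"value":<double>}, with no key reordering, no extra keys
// and no unusual whitespace.
inline std::optional<log_record> parse_canned(std::string_view s) {
  constexpr std::string_view p1 = R"({"id":)", p2 = R"(,"value":)";
  if (s.substr(0, p1.size()) != p1) return {};
  log_record r{};
  const char *e = s.data() + s.size();
  auto [q1, ec1] = std::from_chars(s.data() + p1.size(), e, r.id);
  if (ec1 != std::errc{} ||
      std::string_view(q1, std::size_t(e - q1)).substr(0, p2.size()) != p2)
    return {};
  auto [q2, ec2] = std::from_chars(q1 + p2.size(), e, r.value);
  if (ec2 != std::errc{} || std::string_view(q2, std::size_t(e - q2)) != "}")
    return {};
  return r;
}

// Fallback: a real library would run its full runtime parser here.
inline std::optional<log_record> parse_generic(std::string_view) { return {}; }

// Try the pregenerated canned parser first; pay for general runtime
// parsing only when the input does not match the declared structure.
inline std::optional<log_record> parse(std::string_view s) {
  if (auto r = parse_canned(s)) return r;
  return parse_generic(s);
}

For a long stream of identically structured records almost every call
would take the canned path, which is where the speedup would come from.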

Niall

