From: Ruben Perez (rubenperez038_at_[hidden])
Date: 2023-02-08 10:12:27


I wanted to raise a concern on allocations before emitting my review.
I've gone ahead and modified the html.cpp example to install a "spying"
memory_resource as a quick and dirty way of visualizing allocations.
I've refactored the code to something like this:

// Let's call this the "init" section
boost::json::value vars = boost::json::value_from(ref, spy_ptr);
boost::json::value partials (
    { { "header", header }, { "footer", footer },
        { "item", item }, { "body", body } },
    spy_ptr
);

// "renderer_ctor" section
boost::mustache::renderer rd (std::move(vars),
    std::move(partials), spy_ptr);

// "parsing" section
rd.render_some( html, dummy );
rd.finish( dummy );

This yielded the following results on allocations:

section       | total bytes | total allocations
--------------+-------------+------------------
init          |        1550 |                31
renderer_ctor |        1582 |                32
parsing       |        1900 |                39
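
For readers who want to reproduce this kind of measurement, here is a minimal sketch of a "spying" resource. It is not the code used for the numbers above: it is written against std::pmr::memory_resource (C++17) so that it is self-contained, and the names spy_resource, total_bytes and total_allocations are made up for illustration. Adapting it to the memory_resource type that Boost.JSON's storage_ptr expects should be mechanical, but that is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical stand-in for the "spying" resource: forwards every request to
// an upstream resource and records how many allocations were made and how
// many bytes they requested.
class spy_resource final : public std::pmr::memory_resource
{
    std::pmr::memory_resource* upstream_;
    std::size_t bytes_ = 0;
    std::size_t count_ = 0;

    void* do_allocate(std::size_t n, std::size_t align) override
    {
        // If the upstream throws, the counters are left untouched.
        void* p = upstream_->allocate(n, align);
        bytes_ += n;
        ++count_;
        return p;
    }

    void do_deallocate(void* p, std::size_t n, std::size_t align) override
    {
        upstream_->deallocate(p, n, align);
    }

    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override
    {
        return this == &other;
    }

public:
    explicit spy_resource(
        std::pmr::memory_resource* upstream = std::pmr::new_delete_resource())
        : upstream_(upstream)
    {
    }

    std::size_t total_bytes() const noexcept { return bytes_; }
    std::size_t total_allocations() const noexcept { return count_; }
};

// Usage sketch: route a container's allocations through the spy and read
// the counter afterwards.
inline std::size_t allocations_for_reserve()
{
    spy_resource spy;
    std::pmr::vector<int> v(&spy);
    v.reserve(16);
    assert(spy.total_allocations() >= 1);
    return spy.total_allocations();
}
```

Reading the counters before and after each section gives per-section figures like the ones in the table.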

I was surprised by the number of allocations that the
actual parsing does. Is this because renderer::render_some
must assume that the input buffer may not outlive the actual
parsing?

If I am not mistaken, renderer_ctor's allocations happen
because value_from is invoked unconditionally, even if
a json::value&& is passed in, which results in a copy.
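
As an illustration of the kind of dispatch that would avoid this copy, here is a generic sketch. It does not reproduce the renderer's constructor: value is a hypothetical stand-in that only counts copies, convert stands in for a value_from-style call, and init_vars_always_convert / init_vars_forwarding are made-up names.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a JSON value. It counts copy constructions so
// the difference between the two code paths is observable.
struct value
{
    std::string payload;
    static inline int copies = 0;

    value() = default;
    explicit value(std::string s) : payload(std::move(s)) {}
    value(const value& other) : payload(other.payload) { ++copies; }
    value(value&&) noexcept = default;
};

// Stand-in for a value_from-style conversion: it always builds a new value
// from a const reference, so a value argument gets copied.
template <class T>
value convert(const T& t)
{
    return value(t);
}

// Path 1: convert unconditionally, even when handed an rvalue value.
template <class T>
value init_vars_always_convert(T&& t)
{
    return convert(t);
}

// Path 2: if the argument already is a value, forward it (move for rvalues,
// copy for lvalues); otherwise fall back to the conversion.
template <class T>
value init_vars_forwarding(T&& t)
{
    if constexpr (std::is_same_v<std::decay_t<T>, value>)
        return std::forward<T>(t);
    else
        return convert(t);
}

// Usage sketch: only the unconditional path copies the moved-in value.
inline void demo_move_vs_copy()
{
    value::copies = 0;
    value a("vars");
    value r1 = init_vars_always_convert(std::move(a));
    assert(value::copies == 1);

    value b("vars");
    value r2 = init_vars_forwarding(std::move(b));
    assert(value::copies == 1); // no additional copy
    assert(r1.payload == r2.payload);
}
```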

My other point is about HTML escaping. I am noting it here
as a reference for other reviewers, as the author
is already aware of it.

Some special characters should be HTML-encoded in addition to the ones that
are being escaped today. I've reviewed other Mustache libraries to see
how they handled this topic and whether they had any known vulnerabilities.
The most active one seems to be the JavaScript one
(https://github.com/janl/mustache.js). Looking into the Snyk vulnerability
database, I've found https://security.snyk.io/vuln/npm:mustache:20151207
(CVE-2015-8862), which leads to XSS exploits and does affect Boost.Mustache,
too.

Two exploits are possible:
* The backtick character seems to have special meaning in ancient versions
  of IE (https://html5sec.org/#59, https://github.com/janl/mustache.js/pull/388)
* When using HTML unquoted attributes (e.g. <a href={{user_data}}>), escaping
  the equal sign can mitigate the risk of attribute injection. (IMHO attributes
  without quotes make kittens die, but this is listed as a vulnerability).

The JavaScript library performs this replacement:
* Backtick is replaced by &#x60;
* Equal sign is replaced by &#x3D;

They also escape the forward slash (&#x2F;). I haven't found a vulnerability
to point you to, but I guess that could be added as additional hardening,
if you like.
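
For reference, a minimal sketch of an escaping routine that covers these characters. This is not Boost.Mustache's code: html_escape is a made-up name, and the first five cases are the usual HTML set, included on the assumption that they match what is escaped today. The last three cases are the replacements discussed above.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, not the library's escaping routine.
inline std::string html_escape(std::string_view in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in)
    {
        switch (c)
        {
        // Usual HTML set (assumed to be what is already handled).
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&#39;";  break;
        // Additional replacements discussed in this message.
        case '/':  out += "&#x2F;"; break;
        case '`':  out += "&#x60;"; break;
        case '=':  out += "&#x3D;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}

// Usage sketch for the unquoted-attribute case: after escaping, the injected
// text no longer contains a literal '='.
inline void demo_unquoted_attribute()
{
    std::string tag = "<a href=" + html_escape("x onclick=alert(1)") + ">";
    assert(tag == "<a href=x onclick&#x3D;alert(1)>");
}
```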

Many thanks,
Ruben.

