From: Rainer Deyke (rdeyke_at_[hidden])
Date: 2023-02-07 13:29:31


On 05.02.23 04:55, Klemens Morgenstern via Boost wrote:
> The formal review of the mustache starts today, Sunday the 5th, and ends
> next week on Tuesday the 14th.
>
> The library was developed & submitted by Peter Dimov.
>
> Boost.Mustache is an implementation of Mustache templates in C++11.
>
> The master branch is frozen during the review and can be found here
> https://github.com/pdimov/mustache
>
> The current documentation can be found here:
> https://pdimov.github.io/mustache/doc/html/mustache.html
>
> The documentation of Mustache templates is here: https://mustache.github.io/

I have some serious reservations about Mustache as a text templating
language, and some even more serious reservations about the minimal (no
extensions) version of Mustache used by the proposed library. In
particular, the lack of the lambda extension severely hurts the
versatility of this library, and because boost::json::value is used as
the basic vocabulary type, there is no good way to add this extension to
the library.

Reservation 1: The plural problem
---------------------------------

One fundamental application of a text templating language is to format
a user-visible text string with values substituted in, like the following:

    "Found %d files and %d directories."

Now, one problem with the above is that it uses the plural forms "files"
and "directories" even if the number is 1, which is grammatically
incorrect. I can kind of work around this by using text like this:

    "Found %d file(s) and %d directorie(s)."

But that's ugly, it doesn't work for irregular plurals, and it
definitely doesn't work for languages like Arabic where almost all
plurals are irregular and there is a dual form in addition to the
singular and the plural.

The Jinja templating language solves this problem with conditionals:

   "Found {{num_files}} {% if num_files == 1 %}file{% else %}files{% endif %}
and {{num_dirs}} {% if num_dirs == 1 %}directory{% else %}directories{% endif %}"

That's ugly, but it solves the problem at the templating level. It can
be made prettier by using macros, but it's good enough to demonstrate
that the problem can be solved.

Can this problem be solved effectively at a different level? Maybe, but
the alternative is likely to be even uglier. Three alternate solutions
come to mind, none of them good. The first is to just prepare multiple
versions of the text string:

    "Found {{num_files}} files and {{num_dirs}} directories."
    "Found 1 file and {{num_dirs}} directories."
    "Found {{num_files}} files and 1 directory."
    "Found 1 file and 1 directory."

This obviously does not scale, as it requires n**m different strings,
where n is the number of grammatical numbers in the language (2 for
English) and m is the number of variables being substituted.

Another possibility is to pass in the words being pluralized as arguments:

   "Found {{num_files}} {{files_word}} and {{num_dirs}} {{dirs_word}}."

This works in this simple case for English, but it breaks down when more
complex grammatical structures are used, e.g. "1 file *has* been found"
vs "2 files *have* been found". It also pushes the hard work into the
calling code, which is less than ideal, especially if it means that the
calling code now has to contain language-specific code.
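
To make the "hard work in the calling code" concrete, here is a rough
sketch in plain C++ of what the caller ends up doing for English. The
names english_word_for, found_message_args and make_found_message_args
are hypothetical and are not part of the proposed library:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: picks the English word form for a count.
// This rule is English-specific; other languages need different logic.
std::string english_word_for(long count,
                             const std::string& singular,
                             const std::string& plural)
{
    assert(count >= 0); // counts of files/directories are never negative
    return count == 1 ? singular : plural;
}

// Values the caller must prepare before rendering
// "Found {{num_files}} {{files_word}} and {{num_dirs}} {{dirs_word}}."
struct found_message_args
{
    long num_files;
    std::string files_word;
    long num_dirs;
    std::string dirs_word;
};

found_message_args make_found_message_args(long num_files, long num_dirs)
{
    return {
        num_files, english_word_for(num_files, "file", "files"),
        num_dirs, english_word_for(num_dirs, "directory", "directories"),
    };
}
```

Every pluralized word costs an extra template argument, and the
language-specific rule lives in the C++ code instead of in the template.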

A third possibility, specifically when generating HTML code, is to
generate JavaScript to perform the pluralization on the client side.
This is too complicated to demonstrate in a quick example, only works if
the result is HTML (or something equivalently powerful), and only works
if JavaScript is enabled.

Is it possible to solve this problem with the lambda extension? Kind
of. The lambda extension would allow templates like the following:

   "Found {{num_files}}
{{#choose_plural}}num_files|file|files{{/choose_plural}} and
{{num_dirs}}
{{#choose_plural}}num_dirs|directory|directories{{/choose_plural}}."

However, this puts the onus on the 'choose_plural' function to parse the
section string, look up the controlling variable, and use the result to
choose the correct string. This means that the choose_plural function
ends up duplicating a lot of the functionality of the library. It's a
big ugly hack, but it's a big ugly hack that works, which beats not
having any solution at all.
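
As a rough illustration of how much work such a function has to do,
here is a sketch in plain C++. It assumes the lambda receives the raw
section text plus some way to look up variables; the 'counts' map is a
hypothetical stand-in for the rendering context, not the proposed
library's API (which has no lambda support):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the rendering context: variable name -> count.
using counts = std::map<std::string, long>;

// Handles section text of the form "name|singular|plural": splits it,
// looks up the controlling variable, and picks the English word form.
// Malformed input or an unknown variable yields an empty string.
std::string choose_plural(const std::string& section, const counts& ctx)
{
    const std::string::size_type first = section.find('|');
    if (first == std::string::npos) return "";
    const std::string::size_type second = section.find('|', first + 1);
    if (second == std::string::npos) return "";

    const counts::const_iterator it = ctx.find(section.substr(0, first));
    if (it == ctx.end()) return "";

    return it->second == 1
        ? section.substr(first + 1, second - first - 1)
        : section.substr(second + 1);
}
```

Note that the parsing and the variable lookup are exactly the things a
templating library is supposed to do for me.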

Reservation 2: The trailing comma problem
-----------------------------------------

Mustache sections can be used to list array elements in text. This is
one of the primary advantages of using a full templating system like
Mustache over something like boost::format or std::format. However, it
does not allow a separator between the elements of the array. I can
list the elements without separator:

   "Found these files: {{#files}}{{.}}{{/files}}."

However, if I try to add a separator, I also get a trailing separator at
the end:

   "Found these files: {{#files}}{{.}}, {{/files}}and nothing else."

This is an annoyance for human-readable text. It becomes more than an
annoyance when generating machine-readable text. This makes the
proposed library unsuitable for generating, for example, program code or
JSON. Jinja solves this problem with a built-in filter:

   "Found these files: {{files|join(', ')}}."

Is it possible to solve this problem with the lambda extension? Again,
kind of. I could define a function that allows this:

   "Found these files: {{#join}}files|, {{/join}}."

However, the function would again have to parse the section string,
extract the variable name, look up the variable, and use that to
generate the string. It's an ugly hack that works, which still beats
not having any solution at all.
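
A sketch of such a function in plain C++, again assuming the lambda is
handed the raw section text ("files|, ") and some lookup facility. The
'string_lists' map and the name join_section are hypothetical and only
stand in for whatever context a real lambda extension would provide:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the rendering context: name -> array of strings.
using string_lists = std::map<std::string, std::vector<std::string>>;

// Handles section text of the form "name|separator": looks up the array
// and joins its elements, placing the separator only between elements.
// Malformed input or an unknown variable yields an empty string.
std::string join_section(const std::string& section, const string_lists& ctx)
{
    const std::string::size_type bar = section.find('|');
    if (bar == std::string::npos) return "";

    const string_lists::const_iterator it = ctx.find(section.substr(0, bar));
    if (it == ctx.end()) return "";

    const std::string separator = section.substr(bar + 1);
    std::string result;
    for (std::size_t i = 0; i < it->second.size(); ++i)
    {
        if (i != 0) result += separator; // no separator before the first element
        result += it->second[i];
    }
    return result;
}
```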

Conclusion
----------

I don't think Mustache is a very good templating language. With the
lambda extension, there are ways to hack around its inadequacies.
Without the lambda extension, it feels too restrictive to be useful.
Since a C++ implementation that supports the lambda extension already
exists in the form of the mstch library, do we really want the more
restrictive proposed library in Boost? On the other hand, the lambda
extension is also fairly limited, and does not provide a clean solution
to either of the above problems. What is really needed is something
that goes beyond the lambda extension by allowing arbitrary expressions
to be evaluated and passed to the function, like this:

   "Found {{num_files}} {{pluralize(num_files, 'file', 'files')}} and
{{num_dirs}} {{pluralize(num_dirs, 'directory', 'directories')}}."

   "Found these files: {{join(files, ', ')}}."

I would love to be convinced that the proposed library is more useful
than it looks at first glance. Maybe there is a non-obvious way to use
the built-in functionality of Mustache to provide the desired results,
analogous to the way C++ template metaprogramming extended the
expressive power of C++ templates. However, if I am not convinced then
I expect to vote against its acceptance into Boost.

-- 
Rainer Deyke (rainerd_at_[hidden])
