Boost logo

Boost :

From: Maximilian Riemensberger (riemaxi_at_[hidden])
Date: 2021-03-07 21:54:27


# Review of Describe

## Recommendation

ACCEPT

The library appears to be well designed, well-written and well-documented. I
can see quite a few use cases in my work environment where it would be useful.
Thank you, Peter, for sharing it and proposing it to boost.

## Design

The design appears solid to me. Describing types seems easy and convenient.
The `mod_*` flags are a bit unexpected at first. However, going through all of
the possible combinations, I'm not sure if having lots of separate functions to
retrieve each combination is any better.

## Implementation

I only briefly skimmed over the implementation. It looks clean and organized.
However, I have little experience with this kind preprocessor foo.

## Documentation

The documentation is well-written and clear. I would have liked some
explanation on the limitations of the library, not just a list of what is not
supported. But Peter clarified this via email. Maybe it could be added to the
documentation for everybody else.

## Usefulness

I consider the library useful. I have went through our code base at work and
there is quite a few things where it could come in handy. Obviously enums and
printing. But also the JSON RPC example caught my eyes as we use a little JSON
RPC as well. However, when going through our use cases, I found that describe
often provides just a part of what is necessary to make it really useful.

I consider none of the following points to be critical in hindering the
acceptance of the library to boost as it is already pretty useful without those
things. The scope of the library is ultimately up to those doing the hard work
of designing, implementing , documenting and maintaining it. Also, I think most
of it can be explored in the future independently if there is sufficient interest.

### Amount of reflection data and annotations

The library aspires to be the one standard tool to *describe* your
(user-defined) types and enums. Yet it "only" provides a mechanism to reflect
on base and member name and the respective pointer. It does not provide
reflection on the name of the type being described, nor does it provide
annotations (comparable to language attributes), nor does it provide a
mechanisms to describe free functions and function parameters (free and
member). I understand that not every thing of this list is in scope or even
doable given the current state of C++. But at least the first two points would
be pretty nice. In particular, the first one (name of the type) is not clear
to me why it is not provided since it already has to be named in the
`BOOST_DESCRIBE_*` macro anyway.

#### Annotations

In a previous email I showed the `error_code` example. Others have showed
serialization use cases. It comes up every-time additional data, a custom
string (other than the name, which is restricted to be a valid identifier), or
another type needs to be associated with the member or enumerator.

Just to list some example:

* error_code messages that are not identical to the error code enum identifier
* mapping custom error codes onto error codes from other categories (like std,
   boost, http status codes, etc.)
* conditional exclusion of fields from serialization
* serializing to fields with names that are not valid c++ identifiers
   (Content-length comes to mind, although this one can be worked around with
   `_`)
* annotating member functions with parameter names to support JSON RPC with
   objects, see also below

I believe will grow as people start using describe.

#### Reflection of the name of a described type

As far as I understand, (boost/std) typeinfo cannot be used at compile-time (e.g.
the name of the type). But if I describe my type, I would expect to have at
least the name of the type available at compile-time as a string. So describe
could provide at least this tiny bit of type info at compile-time (see also
first item).

This is probably out of scope, and I guess there is also problems with
namespaces etc. But, I thought to bring it up as well. In particular, because
if I use describe, I have to name the type and give it to the preprocessor
already:

     struct A {
         std::string x = "x is real";
     };
     BOOST_DESCRIBE_STRUCT(A, (), (x))

     struct B {
         A a{};
         int i = 4;
     };
     BOOST_DESCRIBE_STRUCT(B, (), (a, i))

So it would be nice to have also information about the type available, even if
it's just the name.

The universal print example prints

     {.i = 4, .a = {.x = x is real}}

but I would like to print

     B{.i = 4, .a = A{.x = x is real}}

and, if I describe `std::string` and `int` (ok, this might be insane), even

     BOOST_DESCRIBE_STRUCT(int, (), ());
     BOOST_DESCRIBE_STRUCT(string, (), ());

prints

    B{.i = int{4}, .a = A{.x = string{x is real}}}

To sum it up: If I describe a type `B`, I'd like to have all the information
about the type available including it's name, it's members and bases, etc.

#### Describing member functions and their arguments

When I first saw the JSON RPC example, I thought that I could use it
immediately. But taking a closer look revealed that it only supports parameter
arrays and not parameter objects. And thinking about it there's really a
fundamental problem. I can describe the member functions, but I cannot
annotate or describe it's parameters to give them names. This is probably
again out of scope. But considering describes use cases and examples, it looks
like a next step towards library based reflection.

### Convenience tools for the common use cases

The library provides a pretty minimal and convenient syntax to describe
types. However, using those descriptions often seems to involve various
degrees of heavy meta-programming. I think the library could provide a
small set of convenience feature in this area for the common cases. I
believe it would also help a great deal in adoption since I guess that
the meta-programming wall even for very basic things might drive some users
away. And whenever the default convenience tools is not good enough for a
particular use-case it serves as a starting point to build your own version
which is slightly different. I would consider for example `enum_to_string`,
`string_to_enum`, `for_all_enumerators`, `get_member_by_name`,
`get_member_by_pointer`, `for_all_members`, etc. such basic convenience
tools.

### Interaction with PFR and data member offsets and data member index

PFR is about **index**-based access to data member of aggregates. Describe is
about **name**-based access to members of class-types. This view is certainly
a bit over-simplified. But I think it has a certain truth to it. Now, it
would be really awesome (and I have no idea if this is possible) if those two
could interact.

Consider the following example:

     struct A {
         std::string x = "x is real";
     };
     BOOST_DESCRIBE_STRUCT(A, (), (x))

     struct B {
         A a{};
         int i = 4;
     };
     BOOST_DESCRIBE_STRUCT(B, (), (i, a)) // wrong order?!

Now I can clear `boost::pfr::get<...>` those members. For example,
`boost::pfr::get<0>(b)` returns `A&`. And I can `describe_members` them as
well. Less convenient, but I can get a `A B::*` pointer. However, in an ideal
world, I could match the pfr index to the member description ad vice versa.

Also for all non-aggregates, boost describe could provide an alternative index
based access if this is possible. As far as I know gcc on linux implements
pointer to data members via offsets. Probably, all other compilers do
something similar at least on the common platforms. So those offset could be
sorted to check the order of data members in the describe macro and provide
index based access.

Also considering potential compiler-based describe, I would hope that the
compiler provides information beyond just the name and the pointer, e.g.,
including index and offset for data members for example. At least for data
types where those are/can be well defined. Layout order is/should be mandated
anyway, so that the index and offset would make sense (ignoring
`no_unique_address` and other corner cases).

### Who provides the description for boost and std libraries?

For my types it's clear that I provide the description. Just after defining
(or even inside) the enum, struct, class.

But what about boost and std types. For example `boost::system::errc::errc_t`
and `std::errc` seem obvious targets. There is probably quite a few more types
where a description might be reasonable. Is it safe if everybody does it,
maybe even slightly differently by including or excluding certain
members/enumerators or ordering them in different ways? The only alternative
to that mess that I can see is that `boost` or `boost::describe` provides at
least the common ones optionally for convenience.

## Details about the review and reviewer

- Did you try to use the library? With which compiler(s)? Did you
     have any problems?

gcc-8 on debian. No Problems.

- How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?

Approx 10 hours for documentation and playing with the library. I did only
skim over the implementation.

- Are you knowledgeable about the problem domain?

 From a users perspective.

Regards,
Max


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk