Boost logo

Boost :

Subject: Re: [boost] Formal Review of Proposed Boost.Histogram Library Starts TODAY
From: Hans Dembinski (hans.dembinski_at_[hidden])
Date: 2018-09-26 12:30:05


Hi Alex,

> On 26. Sep 2018, at 12:35, <a.hagen-zanker_at_[hidden]> <a.hagen-zanker_at_[hidden]> wrote:
>
> I did some further hands on trying. In particular I implemented my own axis type and used it with the dynamic historgram variant.
>
> 1. I think axis::any is a misnomer. We normally use the word "any" to indicate type erasure. So I would expect axis::any to be able to hold any type that implements the axis concept. Instead, axis::any derives for boost::variant and can hold an axis from a discrete set of types provided by the user. This bites when using your own axis_type in a dynamic histogram, because it means you then also have to specify your own variant of any_std. E.g. I was required to do the following:
>
> using my_any = bh::axis::any<bh::axis::regular<>, bh::axis::category<std::string>, my_axis>;
> std::vector<my_any> axes;
>
> You could rename your axis::any to axis::variant, however I think it would be better to more fully achieve type erasure so you can store your dynamic axes as a std::vector<axis::any> and dramatically simplify usage.

you don't need to define my_any yourself, if you use the make_*_histogram factories, as shown in the user guide. Those work automatically with custom types.

I am open to renaming axis::any, but isn't axis::variant equally confusing? I avoided the name "variant", because axis::any does more than a normal variant, it implements the common interface of the builtin axis types to provide some convenient polymorphism. axis::any cannot represent all aspects of all axes, but it works for simple cases at least.

> 2. The bin_type concept is very opaque to me. And, in the end the only two member functions I was required to implement to make my bin_type work were:

The concepts need better documentation, sorry about that, and so does the section about implementing your own axis types.

> std::string value() const { return is_negative_bin ? "negative" : "positive"; }
> operator int() const { return is_negative_bin ? 0 : 1; }
>
> "value()" seems to be a misnomer, since it is used to print a label for the bin (in your second example: getting_started_listing_02.cpp line 41). To me it is not logical for a bin to have a value() member function, since a bin really is a domain of values.

That depends on the axis. axis::category has bins which represent a single value, not a continuum. That's the example you are referring to. Same for axis::integer. The other axes types use a different kind of bin type that represents an interval. That's why there are two helper templates interval_view and value_view. Use the former for axis types whose bins represent an interval and the latter for axis types whose bins represent a single item.

> operator int() seems inappropriate because why would a bin need to know its index? That should logically be the responsibility of the axis

I explained the benefit in an answer to Steven.

> 3. I think this was covered before, but the inheritance from labeled_base is awkward. I am not just required to inherit from it, but also construct it in the appropriate form.
>
> I ended up constructing the base class with:
>
> my_axis() : labeled_base(2, bh::axis::uoflow_type::off, "positive vs. negative", std::allocator<char>{})
> { ...
>
> It seems too onerous to require my own axis to know the number of bins on construction. What if I want to add bins one at a time? What if my number of bins is based on some function?
> I also really don't like having to bother with std::allocator<char>; I normally stay away from that kind of complexity. Why should using histograms force me to start thinking about allocators?

The allocator argument can get a default so you don't have to see it. To be honest, I didn't think someone would want to make axis from scratch. Usually, you want to bend behaviour of an existing axis a bit to achieve what you want, and that is much simpler, see the example in the user guide.

> 4. Using iterator_over might not be efficient
>
> If my axis stores bin_types in a std::vector (which seems reasonable), why can I then not just use the iterator of std::vector? You could achieve that by leaving it up to the user to implement the iterators, and offer iterator_over as the default..
>
> I still think histogram should be accepted, but in my opinion there remains quite some scope to tidy up the concepts to make working with histograms as simple as possible.

I am thankful for your input on the advanced usage of the library, and of course that should be as simple as possible and it needs better documentation, but you cannot expect it to be as simple as the basic usage. A good library should make simple things simple and complicated things possible.

Best regards,
Hans


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk