Subject: Re: [boost] [histogram] Formal review
From: Hans Dembinski (hans.dembinski_at_[hidden])
Date: 2018-09-24 21:52:47
> On 23. Sep 2018, at 16:26, Bjorn Reese via Boost <boost_at_[hidden]> wrote:
> Boost.Histogram contains useful and well-designed features, like excess
> (over/underflow) bins, and axis transforms to name a few.
> As this is a review, I am going to spent most time below on critical
> scrutiny. [â¦]
thank you very much for the great review and the time spend with the library. Some discussion below.
> I. DESIGN
> Iterators are const. This prevents us from bootstrapping the histogram
> with a prior distribution using STL algorithms like std::fill(),
> std::generate(), or std::sample().
Writable iterators could be added, but I am not convinced that is a good idea.
The histogram iterators are const, because I cannot imagine a use-case where the histogram has to be bootstrapped with a prior distribution. Only providing const iterators avoids the misconception that a histogram is a container. Histogram can make stronger guarantees about its internal consistency, if users cannot set bin counters arbitrarily.
> The adaptive_storage does not work well with STL algorithms, because
> weight_counter is not an arithmetic type. See the Implementation section
> below for more detail. I therefore propose that the default storage
> policy is changed to something without weight_counter.
There are two solutions, right? Turing weight_counter into a full arithmetic type or removing support for weight_counter in the default storage.
> The adaptive_storage has two responsibilities: data compaction and
> weights. Would it be possible to split this into two separate storage
> policies? I have no real use for arrival variance.
Agreed, it should be a compile-time option for adaptive_storage.
> I am not too fond of using operator() for insertion. The code looks like
> a function call with side-effect. I prefer an explicitly named member
Originally, I had a special method called fill(), then I switched to operator(). There are two arguments in favour of operator():
- a 1D histogram can be filled with std::for_each or std::accumulate
- consistency with Boost.Accumulators, which also use operator()
> The axis::ouflow_type is oddly named. I suggest that this is renamed to
> something like axis::excess_type.
I am ok with renaming, but I think we haven't found the best name yet. What do you think about axis::tails_type?
> II. IMPLEMENTATION
Agreed, the missing bits will be implemented and unit-tested.
> III. DOCUMENTATION
Agreed, documentation will be expanded in this way.
> Boost.Histogram should be ACCEPTED into Boost, provided:
> 1. Reference documentation is finalized.
> I furthermore strongly recommend that the default storage policy is
> changed to something without weight_counter because of the various
> problems with STL algorithms. This recommendation is not a prerequisite
> for acceptance.