Boost logo

Boost :

Subject: Re: [boost] [histogram] Formal review
From: Hans Dembinski (hans.dembinski_at_[hidden])
Date: 2018-09-24 21:52:47


Dear Bjørn,

> On 23. Sep 2018, at 16:26, Bjorn Reese via Boost <boost_at_[hidden]> wrote:
>
> Boost.Histogram contains useful and well-designed features, like excess
> (over/underflow) bins, and axis transforms to name a few.
>
> As this is a review, I am going to spent most time below on critical
> scrutiny. […]

thank you very much for the great review and the time spend with the library. Some discussion below.

> I. DESIGN
> ---------
>
> Iterators are const. This prevents us from bootstrapping the histogram
> with a prior distribution using STL algorithms like std::fill(),
> std::generate(), or std::sample().

Writable iterators could be added, but I am not convinced that is a good idea.

The histogram iterators are const, because I cannot imagine a use-case where the histogram has to be bootstrapped with a prior distribution. Only providing const iterators avoids the misconception that a histogram is a container. Histogram can make stronger guarantees about its internal consistency, if users cannot set bin counters arbitrarily.

> The adaptive_storage does not work well with STL algorithms, because
> weight_counter is not an arithmetic type. See the Implementation section
> below for more detail. I therefore propose that the default storage
> policy is changed to something without weight_counter.

There are two solutions, right? Turing weight_counter into a full arithmetic type or removing support for weight_counter in the default storage.

> The adaptive_storage has two responsibilities: data compaction and
> weights. Would it be possible to split this into two separate storage
> policies? I have no real use for arrival variance.

Agreed, it should be a compile-time option for adaptive_storage.
https://github.com/HDembinski/histogram/issues/75

> I am not too fond of using operator() for insertion. The code looks like
> a function call with side-effect. I prefer an explicitly named member
> function.

Originally, I had a special method called fill(), then I switched to operator(). There are two arguments in favour of operator():
- a 1D histogram can be filled with std::for_each or std::accumulate
- consistency with Boost.Accumulators, which also use operator()

> The axis::ouflow_type is oddly named. I suggest that this is renamed to
> something like axis::excess_type.

I am ok with renaming, but I think we haven't found the best name yet. What do you think about axis::tails_type?

> II. IMPLEMENTATION
> —————————
> […]

Agreed, the missing bits will be implemented and unit-tested.

> III. DOCUMENTATION
> —————————
> […]

Agreed, documentation will be expanded in this way.

> Boost.Histogram should be ACCEPTED into Boost, provided:
>
> 1. Reference documentation is finalized.
>
> I furthermore strongly recommend that the default storage policy is
> changed to something without weight_counter because of the various
> problems with STL algorithms. This recommendation is not a prerequisite
> for acceptance.

Thanks!


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk