Boost logo

Boost :

Subject: Re: [boost] [histogram] Formal review
From: Bjorn Reese (breese_at_[hidden])
Date: 2018-09-30 15:29:33


On 09/24/18 23:52, Hans Dembinski wrote:

> The histogram iterators are const, because I cannot imagine a use-case where the histogram has to be bootstrapped with a prior distribution. Only providing const iterators avoids the misconception that a histogram is a container. Histogram can make stronger guarantees about its internal consistency, if users cannot set bin counters arbitrarily.

If you are using histograms to collect data for later analysis, then
there may indeed not be much use for a prior distribution. However, if
you use histograms to make decisions on-the-fly, then you cannot always
wait for the histogram to be filled up with data in order to make
reliable decisions. In such cases a prior distribution can help.

Admittedly, I have not done this with histograms, but I do use CDFs
(well, quantile approximations) quite a lot for on-the-fly decision-
making.

The other use case is to restore the histogram with data from an
earlier run, but serialization already covers that.

> There are two solutions, right? Turing weight_counter into a full arithmetic type or removing support for weight_counter in the default storage.

True, but then I am going to play the zero-overhead card :)

> Originally, I had a special method called fill(), then I switched to operator(). There are two arguments in favour of operator():
> - a 1D histogram can be filled with std::for_each or std::accumulate
> - consistency with Boost.Accumulators, which also use operator()

Inserting data into the histogram with std::accumulate is odd.

An alternative is to use std::copy (or std::transform) with
std::inserter. That does requires a histogram::insert(iterator hint,
value_type value) where the hint is completely useless... but that
is what std::set is using.

> I am ok with renaming, but I think we haven't found the best name yet. What do you think about axis::tails_type?

Seems like you settled on options in another thread. I am fine with
that.

Related to this, I also like the extent() over shape(). This name is
also used by Boost.MultiArray and the std::mdspan proposal (P0009R7).
In the name of consistency, you may also consider renaming dim() to
rank().


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk