
Subject: [boost] [histogram] discussion of accessor interface design
From: Hans Dembinski (hans.dembinski_at_[hidden])
Date: 2019-01-15 11:35:17
Dear all,
Work on Boost.Histogram is progressing fast. I am still implementing feedback from the review, simplifying the interface, and adding requested features, such as axes that can grow. STL compatibility has been further improved; you can now also write to histograms via iterators.
The normal iterators make sense when you iterate over a 1D histogram. When you iterate over a multidimensional histogram, you also want to know the current multidimensional index.
After considering many options, and I really thought about this a lot, I went for the following design, which is a bit unusual. Therefore I would appreciate feedback. I think it is great, once you overcome an initial feeling of awkwardness.
The accessor class is "polymorphic": it behaves like a pointer to the histogram value, and like an array for the multidimensional index. In code, this is how you use it:
auto h = make_histogram(…); // make 2D histogram
for (auto&& x : indexed(h)) { // indexed produces a range of accessors
  // x is a special accessor type, combining two non-overlapping concepts
  // - it acts like a pointer to the histogram value
  // - it acts like an array of the current indices
  std::cout << "current value " << *x << std::endl; // "dereference" to get value
  std::cout << "current index " << x[0] << " " << x[1] << std::endl; // use subscript operator to get index
}
This syntax is beautifully terse, e.g. see
https://github.com/HDembinski/histogram/blob/develop/examples/guide_access_bin_counts.cpp
for a full example, especially line 66.
Pros:
- really terse
- you can access methods on the pointee with x->method() as well (useful when histogram counters are not PODs)
- you can iterate over x to get the indices: for (auto i : x) { … } works
- since x acts like an array of the indices, you can pass it to functions which accept ranges or iterators (it has .begin() and .end())
- x can be (and is) enhanced with other useful methods
  * x.bin(N) returns the current bin interval for the N-th axis, allowing you to access the central value, width, and edges
  * x.density() returns the current density (bin value divided by the product of the current bin widths)
Cons:
- *x and x[0] do completely different things: *x gives you the bin value, x[0] gives you the first index
The last point is, of course, where people have a problem. But if you take C++ concepts seriously, then the accessor is a perfect model of a pointer and a perfect model of an array. These two roles are non-overlapping and they have non-overlapping sets of interfaces, which I exploit here.
If you expect *x and x[0] to do the same thing, that expectation comes from C. C has a much weaker type system than C++ and largely blurs the distinction between arrays and pointers (an array decays to a pointer in most expressions), although these are very different concepts. A pointer points to a value, and an array is a collection of values. "Dereferencing" a collection of values makes no sense; we are just used to it because of our C heritage.
C++ has a better type system, and better classes for pointers and arrays than raw pointers. The stdlib authors recognize that *x has no meaning when x is a sequence of values. *x fails to compile when x is a std::vector, std::deque, std::list, or any other kind of collection in the stdlib. Even for std::unique_ptr, they made sure that the interfaces for the pointer-to-object and pointer-to-array specializations behave differently:
```
#include <memory>

int main() {
  std::unique_ptr<int> p(new int(0)); // pointer version
  // p[0]; // does not compile: no operator[]
  *p; // OK
  std::unique_ptr<int[]> a(new int[3]()); // array version
  // *a; // does not compile: no operator*
  a[0]; // OK
}
```
Once you accept that the two concepts of pointers and arrays have non-overlapping interfaces in C++, it becomes possible to make a combined object which has both interfaces, and uses these two sets to return different information.
If you have read this far, I hope your initial reaction of "woah, this looks really inconsistent and confusing!" has turned into "hmm, maybe it is not inconsistent after all".
I am looking forward to hearing your thoughts.
Best regards,
Hans
PS: The alternative would be to return a std::pair<index_type, value_reference_type>, but this has disadvantages. Unpacking the pair is going to be nice in C++17 with structured bindings, but not so much in C++14. Also, it prevents me from adding convenience methods, like the above-mentioned bin(N) and density() methods.