Boost logo

Boost :

Subject: Re: [boost] [histogram] should some_axis::size() return unsigned or int?
From: Hans Dembinski (hans.dembinski_at_[hidden])
Date: 2018-12-03 14:41:08


> On 3. Dec 2018, at 03:15, Jorg Brown via Boost <boost_at_[hidden]> wrote:
>
> = - = - = - = - =
>
> Before going down the road of size() and ssize(), the real question is what
> should this do:
>
> for (const auto &el: some_axis) {
> // Iterates over the underflow/overflow bins?
> }

It should skip underflow/overflow.

> for (size_t i = 0; i != some_axis.size(); ++i) {
> auto &el = some_axis[i];
> // Iterates over the underflow/overflow bins?
> }

It should skip underflow/overflow and not produce a warning.

> It looks like you wanted to use an index of -1 and size() to refer to
> underflow and overflow. That's... interesting... and I mean "interesting"
> in a good sense of the word.

It is really the most intuitive for a naive person with a math background, but without a strong C++ background. From the math perspective, an axis is an arrow over the real line. Here is my picture again.

Value arrow for axis::regular<>(3, 1, 4): // 3 bins from 1 to 4

             1 2 3 4
-inf ——————— | ————— | ————— | ————— | —————————> +inf
     bin -1 bin 0 bin 1 bin 2 bin 3

Both the values and the indices are ordered and have a natural correspondence. If the bin 0 is the interval [1, 2), then clearly the bin -1 should be the interval [-inf, 1), and similar for the overflow bin. The bins from index 0 to 2 are the "inner bins".

Indexing over inner bins should not behave differently for an axis with and without *-flow bins. Therefore, the index of the first inner bin must be 0 and the last must be size() - 1, always. I could place the *-flow bins after the normal bins, at size() and size() + 1, but this breaks the nice correspondence between the ordering of indices and value intervals. I do not want to give that up easily.

> What this means is that someone who wants to iterate the *-flow bins might
> write:
> […]
> for (int i = -1; i < some_axis.size() + 1; ++i) {

The goal is to make such code work correctly.

> with an ssize() routine, this does work:
>
> […]
> for (size_t i = -1; i < some_axis.ssize() + 1; ++i) {

In this case, you would correctly get a warning from the compiler about initialising a value of type size_t with -1. I don't mind that this produces a warning, it should.

The goal is to make these two lines work correctly and give no warning:

for (unsigned i = 0; i < some_axis.size(); ++i) // …

for (int i = -1; i < some_axis.size() + 1; ++i) // …

I still think that most of the suggestions here try to work around a problem that we have with builtin integers, namely that the code `-1 < 1u` is not doing what you would naively expect.

So an elaborate solution is to make a new integer type `axis::size_type` for axis objects, which wraps an `int` and is implicitly convertible to `int`, but implements comparison operators with `int` and `unsigned` types to yield the naive results without warnings:

Method signature:
axis::size_type my_axis::size() const

Behaviour of axis::size_type:
axis::size_type s = 0;
assert((s-1) == -1); // ok, uses custom operator-(axis::size_type, int)
assert(-1 < s); // ok, uses custom bool operator<(int, axis::size_type)
assert(s < 1u); // no warning, uses custom operator<(axis::size_type, unsigned)

The only "problem" that I can see with this solution is that I have to implement a lot of operators and write a lot of additional tests, etc. But it is most direct solution to the problem that I can see.

Best regards,
Hans


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk