Boost logo

Boost :

Subject: Re: [boost] [histogram] review part 2 (implementation)
From: Steven Watanabe (watanabesj_at_[hidden])
Date: 2018-09-25 20:45:50


AMDG

On 09/25/2018 01:34 PM, Hans Dembinski wrote:
>
>> 128: if (Archive::is_loading::value) { this->~variable(); }
>> Don't call the destructor like this.
>
> How should I call it then?
>

  Don't. You should only call the destructor
explicitly if you intend to end the object's
lifetime. In particular, it should only be
used for objects that were constructed with
placement new. The code has undefined behavior,
because you are manipulating an object that
does not exist (because it has been destroyed).
Even if you ignore that problem, it's not exception-safe.

>> - serialize for user defined axis types seems
>> error prone, because labelled_base::serialize
>> is inherited. If the user forgets to add serialize,
>> the code will compile and silently slice the object.
>
> How can I avoid this?
>

The only simple solution is: don't use inheritance.
It's also probably possible to use the non-member
version of serialize, somehow.

Note: The same issue appears for other members as well,
but I don't really consider it a problem, because they
aren't optional, and are, therefore, less prone to being
forgotten than serialize.

>> weight.hpp:
>>
>> 12: namespace detail {
>> Be careful about exposing types from namespace detail
>> in any way. It allows ADL in client code to look inside
>> your detail namespace.
>
> So better make weight_type and sample_type public?
>

Or in their own namespace.

>> axis/base.hpp:
>>
>> - I only see operator==, but not operator!=.
>
> Axes types are not required to have operator!=.
>

That doesn't prevent you providing it, and
it is more friendly to do so.

>> - Actually, I find interval_view somewhat unconvincing.
>> Why can't it just be a pure value that stores the upper
>> and lower bounds? The value_type is usually going to
>> be a built-in type anyway, not to mention that you
>> haven't documented interval_view sufficiently for
>> it to be used by user-defined axes.
>
> Let's say an interval is returned as a std::pair<value_type, value_type>. Every time you call axis[k], axis::lower(k) and axis::lower(k+1) are called to fill both values of the pair.
>
> Now assume you want to fill a std::vector with all the bin edges (a realistic use case). You iterate over the axis, get the intervals p, and append p.first to the vector. When you reach the last bin, you also append p.second. All the other p.seconds are filled while iterating over the axis, but not used. For N bins, lower() is called 2 N times while only N+1 times the value is actually used. This is very inefficient.
>
> interval_view was my solution to avoid this superfluous calling of lower(). Is the compiler is smart enough to avoid the superfluous filling of values that are not used? I don't know, maybe. If so then I would also prefer the simpler solution.
>

  It should be possible. I would certainly
assume, unless and until proven otherwise,
that the compiler can optimize it as long
as no reference or copy of the interval_view
escapes from the immediate context.

> Another nice aspect of interval_view: it is explicitly convertible to the bin index.
> This allows you to run a range-based for-loop over an axis and query the histogram directly with these instances
>
> for (auto bin : histogram.axis(0)) {
> const auto value = histogram.at(bin); // only works because "bin" knows index
> }
>

That doesn't strictly require you to maintain
a reference to the axis.

>> - Is there a reason that you're not just using std::vector
>> for variable and category?
>
> Yes, sizeof(detail::buffer<T>) < sizeof(std::vector<T>) and axis types should be as small as possible. We don't need growth, so there no need to store a capacity in addition to a size.
>

Why do the axis types need to be small? Even if you
waste a bit of memory, it should be insignificant
compared to the actual data, right? Also, if you're
that concerned about memory usage, the label costs
at least as much.

>> detail/axes.hpp:
>>
>> 521: optional_index call_impl(Tag, const std::tuple<T>& axes,
>> It seems that this specialization prevents using a single
>> element container with a single element static histogram,
>> which seems inconsistent with the documentation and the
>> behavior for dynamic histograms.
>
> Correct, because this case is ambiguous. Let's say you have a 1d-histogram h1 and 2d-histogram h2. Both are dynamic histograms and you fill them.
>
> h2(1, 2); // ok
> h2(std::make_tuple(1, 2)); // ok, one argument must represent two values
> h1(1); // ok
> h1(std::make_tuple(1)); // ambiguous!
>
> h1 cannot know at compile-time whether it should unpack the tuple or pass it to the axis unchanged. Perhaps you have a weird axis that accepts tuples as value_type.
>
> I had to make a decision here and the most reasonable one is to pass single arguments of 1d histograms unchanged to the axis.
>

Shouldn't the same reasoning apply to dynamic histograms
as well? This definitely needs to be documented. The
documentation says that you can pass a container, without
saying that one argument is an exception.

In Christ,
Steven Watanabe


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk