Boost logo

Boost :

Subject: Re: [boost] [histogram] review part 2 (implementation)
From: Steven Watanabe (watanabesj_at_[hidden])
Date: 2018-09-25 22:42:32


AMDG

On 09/25/2018 03:44 PM, Hans Dembinski wrote:
>> On 25. Sep 2018, at 22:45, Steven Watanabe via Boost <boost_at_[hidden]> wrote:
>>
>> On 09/25/2018 01:34 PM, Hans Dembinski wrote:
>>>
>>>> 128: if (Archive::is_loading::value) { this->~variable(); }
>>>> Don't call the destructor like this.
>>>
>>> How should I call it then?
>>>
>>
>> Don't. You should only call the destructor
>> explicitly if you intend to end the object's
>> lifetime. In particular, it should only be
>> used for objects that were constructed with
>> placement new. The code has undefined behavior,
>> because you are manipulating an object that
>> does not exist (because it has been destroyed).
>> Even if you ignore that problem, it's not exception-safe.
>
> I googled a bit but couldn't find that there is anything special about calling a destructor compared to calling other methods.
> https://stackoverflow.com/questions/14187006/is-calling-destructor-manually-always-a-sign-of-bad-design
>
> I want to deallocate the memory of axis::variable here, so I call the destructor manually, which does that. The alternative is to either repeat the deallocation code in the serialize function (error prone), or move the deallocation code into another method which is then called by the destructor and the serialize function. I don't see the gain unless something special happens when you call destructors.
>
Destructors (along with constructors) are special.

"The lifetime of an object of type T ends when ... the
destructor call starts"
"The properties ascribed to objects throughout this
International Standard apply for a given object only
during its lifetime"
[basic.life] (C++11)

And it gets worse. You're not just deallocating the
buffer. You're also calling the destructors of all
subobjects, including the label. The only thing that
saves you from a double free is that you're using small
labels that fit in the internal buffer in your tests.

>>> Yes, sizeof(detail::buffer<T>) < sizeof(std::vector<T>) and axis types should be as small as possible. We don't need growth, so there no need to store a capacity in addition to a size.
>>>
>>
>> Why do the axis types need to be small? Even if you
>> waste a bit of memory, it should be insignificant
>> compared to the actual data, right? Also, if you're
>> that concerned about memory usage, the label costs
>> at least as much.
>
> The histogram iterators over axis types a lot. If they are small they are more likely to fit onto a cache line. When you use a dynamic histogram, the size of axis::any<…> is size of largest bounded axis type plus some bytes for the type index. The largest bounded axis type determines the size of axis::any, so I was generally careful in making the larger axis types not larger than necessary.
>
> True, labels have a size cost of several pointers, but having labels is more important than reducing the memory footprint, since they are very useful in any non-trivial program with more than one kind of histogram.
>
> I had a special tiny_string implementation at some point with the size of one pointer. It even had small-string-optimisation, you can find it here:
> https://github.com/HDembinski/stateful_pointer
>
> Then I decided that adding a custom string type is overkill and used boost::container::string instead.
>
> It is not overkill to use detail::buffer here instead of std::vector, since detail::buffer is anyway used by adaptive_storage. Replacing it here with std::vector does not make detail::buffer obsolete.
>

My real complaint is not so much overkill as the
difficulty of getting manual memory management right,
which makes the code more error-prone and harder to
understand. I believe that the problems I have
pointed out are a sufficient demonstration of this.

>>>> detail/axes.hpp:
>>>>
>>>> 521: optional_index call_impl(Tag, const std::tuple<T>& axes,
>>>> It seems that this specialization prevents using a single
>>>> element container with a single element static histogram,
>>>> which seems inconsistent with the documentation and the
>>>> behavior for dynamic histograms.
>>>
>>> Correct, because this case is ambiguous. Let's say you have a 1d-histogram h1 and 2d-histogram h2. Both are dynamic histograms and you fill them.
>>>
>>> h2(1, 2); // ok
>>> h2(std::make_tuple(1, 2)); // ok, one argument must represent two values
>>> h1(1); // ok
>>> h1(std::make_tuple(1)); // ambiguous!
>>>
>>> h1 cannot know at compile-time whether it should unpack the tuple or pass it to the axis unchanged. Perhaps you have a weird axis that accepts tuples as value_type.
>>>
>>> I had to make a decision here and the most reasonable one is to pass single arguments of 1d histograms unchanged to the axis.
>>>
>>
>> Shouldn't the same reasoning apply to dynamic histograms
>> as well? This definitely needs to be documented. The
>> documentation says that you can pass a container, without
>> saying that one argument is an exception.
>
> static and dynamic histograms have the exact same behaviour in this regard.

That may be what you intended, but it isn't
what the code actually does. I can pass
a one-element tuple to a dynamic histogram
and it will be unpacked. Similarly, I can
pass a (one element) vector to a (one axis)
dynamic histogram and it will work. The
latter has to be supported as the size cannot
be detected at compile time.

> Agreed, this has to be documented.
>

In Christ,
Steven Watanabe


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk