Boost logo

Boost :

Subject: Re: [boost] [MPL][vector] Is there interest in mpl::vector using variadic templates?
From: Bruno Dutra (brunocodutra_at_[hidden])
Date: 2017-03-03 16:51:26


On Fri, Mar 3, 2017 at 2:46 PM, Larry Evans via Boost <boost_at_[hidden]
> wrote:

> On 03/02/2017 01:54 PM, Bruno Dutra via Boost wrote:
>
>> On Thu, Mar 2, 2017 at 7:42 PM, Larry Evans via Boost <
>> boost_at_[hidden]
>>
>>> wrote:
>>>
>>
>> On 03/02/2017 11:54 AM, Larry Evans via Boost wrote:
>>>
>>> On 03/02/2017 11:27 AM, Peter Dimov via Boost wrote:
>>>>
>>>> Larry Evans wrote:
>>>>>
>>>>> One problem with the above cppljevans mpl is there's no at.hpp.
>>>>>
>>>>>> Instead, the non-variadic boost/mpl/at.hpp was used.
>>>>>> The reason no variadic at was created was because, AFAICT, there was
>>>>>> no non-recursive method for picking the I-th element from T... , and,
>>>>>> IIUC, recursive templates cause compile-time slow downs.
>>>>>>
>>>>>>
>>>>> Have you read
>>>>>
>>>>> http://pdimov.com/cpp2/simple_cxx11_metaprogramming_2.html
>>>>>
>>>>> ?
>>>>>
>>>>>
>>>> Nope. Thanks *very much* for the link. I'm impressed (especially
>>>> with the way you actually cited the parts of the standard to guide
>>>> your search for the best method!).
>>>>
>>>>
>>>> Search for mp_at.
>>>>>
>>>>> Thanks for that tip.
>>>>>
>>>>
>>>> I'm a bit surprised that the mp_map_from_list was fastest.
>>>> I would have thought that large template classes mean slow
>>>> compile times, but I guess not.
>>>>
>>>>
>>> Since you seem to be interested on benchmarking compilation times, check
>> out http://metaben.ch.
>>
>> The at benchmark there indicates mpl.vector is the fastest
> in the, I guess you'd call it, the "performance at given
> number_of_elements pop-up list", when the cursor is moved
> over to the rhs at number_of_elements=500. Hmmm. Now I see
> there's actually no mpl.vector graph at that
> number_of_elements. OOPS. I see that that's because
> mpl.vector was not run for number_of_elements > 50, but
> that's only obvious when the subtract_baseline radio button
> is turned-off. To avoid this confusion, the graph should
> give some indication, in the "performance at given
> number_of_elements pop-up list", which methods were actually
> run at the given number_of_elements.

This is actually a known feature of NVD3, which is the tool we use to plot
data, see: https://github.com/ldionne/metabench/issues/158

I agree it is not very intuitive the way it is, but it is not obvious how
to fix it without hacking NVD3. I appreciate thoughts on that.

> This:
>>>
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4235.htm
>>>
>>> claims:
>>>
>>> template instantiations use memory that is never freed during the
>>> compilation process
>>>
>>> so, maybe mp_map_from_list should be used with caution. If many
>>> mp_map_from_list instantiations are used, I'm guessing the compiler
>>> might become pressed for memory. Maybe a benchmark showing
>>> compile time vs number of instantiations would show this.
>>> For example, instead of N=800, see what happens when 2
>>> instantiations with N=400 happen, and compare with a similar
>>> test for mp_repeat_c.
>>>
>>
>>
>> This is actually not the case, in fact the exact opposite is true.
>>
>
> So n4235.htm was wrong or maybe I've misinterpreted it?

>
> It turns
>> out mp_map_from_list is actually very cheap to the compiler in terms of
>> memory allocation, because it takes advantage of template memoization, so
>> if you retrieve every single element of a list using this trick, the
>> compiler needs to instantiate it only once. On the other hand, the trick
>> employing void pointers can't be expressed in a way that take advantage of
>> memoization, which means that, for every element retrieved from the list,
>> the entire pack of void pointers must be instantiated again.
>>
>
> So how do you resolve what n4235 is saying and what you say above?
>
> I'm guessing that 4235 was actually referring to this
> memoization when it said:
>
> template instantiations use memory that is never freed
>
> The memoizations do use memory which, IIUC, is never freed
> until the compiler is finished. OTOH, when you say:
>
> > for every element retrieved from the list, the entire pack
> > of void pointers must be instantiated again.
>
> That means more compile time is needed for the repeated
> instantiations; however, no memory is *permanently* used as
> is the case for memoization. IOW, this is a classic case of
> trading memory for speed:
>
> https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff
>
> OTOH, my point was that as these memoization's increase, the
> compiler has less memory to use; hence, might slow down;
> whereas, since, the memory used by the vec_of_void* method
> is reused, memory pressure is less, and, although it's
> slower for 1 instance of a pack, it might become faster as
> many packs are used.
> <http://lists.boost.org/mailman/listinfo.cgi/boost>
>

I believe you've misinterpreted n4235 and/or what I was trying to say.

Memoization is not something the programmer can tell the compiler to do or
not, compilers just do it, because that is what they are made for:
instantiate templates and record information about them so that they can be
used elsewhere in the program where they happen to be visible, so it is
true that memory is never freed once a template is instantiated, because in
general it must not.

Now let us take a look at the problem at hand from this point of view.
Using the type-map idea, for a given list of N elements, the compiler
instantiates the *same* type-map of N key-value pairs whenever any element
from this list is retrieved. Since information about this type is memoized,
the compiler doesn't need to repeatedly instantiate it, it can simply look
it up. This means a single extra template instantiation is needed to
retrieve all N elements of a list. OTOH the void* trick requires that a
pack of M void pointers be instantiated when the element at position M is
retrieved from a list, so, for a list of N elements, the compiler must
instantiate N *distinct* packs of void pointers, because each of these
packs have a different size and is thus a distinct type. This means the
compiler does not benefit from memoization at all. Also remember that a
bunch of intermediary instantiations are required in order to generate a
pack of void pointers, even if we use the logarithmic trick, so we end up
forcing the compiler to store a bunch of useless information about all
these dummy types. That is what compromises compilation times when using
the void pointer trick.

Bruno


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk