Subject: Re: [boost] tuple benchmarks show marked differences from std::tuple(was Re: Interesting article on stack-based TMP
From: Larry Evans (cppljevans_at_[hidden])
Date: 2012-10-25 12:58:33

On 10/24/12 14:09, Eric Niebler wrote:
[snip]
> I presented at BoostCon my own benchmarks of tuple with and without
> preprocessing. The results were unambiguously and strongly in favor of
> unrolling with the preprocessor. Tested with gcc. The presentation is
> here:
>
> https://github.com/boostcon/cppnow_presentations_2012/blob/master/mon/trouble_with_tuples.pptx

Thanks. I took a look at it with:

http://www.viewdocsonline.com/document/

and saw the comparison chart on slide 12. That chart, as you say above,
unambiguously favors the unrolled tuple.

>
> The source code is here:
>
> https://github.com/ericniebler/home/tree/master/src/tuple
>

I downloaded that and AFAICT:

* The preprocessor method is in:

unrolled_tuple.hpp

and is roughly the same as the vertical tuple implementation here:

http://svn.boost.org/svn/boost/sandbox/variadic_templates/sandbox/slim/test/tuple_impl_vertical.hpp

The main difference, AFAICT, is that unrolled uses aggregation,
via the member declaration:

tuple<Tail...> tail;

on line 133. In contrast, the vertical tuple uses inheritance:

struct tuple_impl<Index, BOOST_PP_ENUM_PARAMS(TUPLE_CHUNK,
TUPLE_IMPL_TYPE_NAME), Others...>
: tuple_impl<Index+TUPLE_CHUNK, Others...>

as shown on line 42 of the .hpp file.
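
To make the aggregation vs. inheritance contrast concrete, here is a
minimal sketch. It is not the code from either file: the names
agg_tuple, inh_tuple and second_of are made up for illustration, and
the chunk size is 1 rather than UNROLL_MAX or TUPLE_CHUNK.

```cpp
#include <cassert>

// Hypothetical names throughout; one element per level for brevity.

// Aggregation: the rest of the tuple is a data member named tail.
template<typename... Ts> struct agg_tuple {};
template<typename Head, typename... Tail>
struct agg_tuple<Head, Tail...>
{
    Head head;
    agg_tuple<Tail...> tail;
};

// Inheritance: the rest of the tuple is a base class, keyed by index.
template<int Index, typename... Ts> struct inh_tuple {};
template<int Index, typename Head, typename... Others>
struct inh_tuple<Index, Head, Others...> : inh_tuple<Index + 1, Others...>
{
    Head head;
};

// Element 1 is reached through .tail in the aggregate form...
inline int second_of(agg_tuple<char, int> const &t)
{
    return t.tail.head;
}

// ...and through a derived-to-base conversion in the inheritance form.
inline int second_of(inh_tuple<0, char, int> const &t)
{
    inh_tuple<1, int> const &base = t;
    return base.head;
}
```

In the aggregate form every step of a get has to name the tail member,
whereas in the inheritance form the index in the base class type
selects the level directly.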

I'm still trying to understand how the get works. What's puzzling
to me is:

        template<typename Tuple, int I>
        static inline constexpr auto get_elem(Tuple &&that, int_<I>)
        RETURN(
            impl<I-I>::get_elem(static_cast<Tuple &&>(that).tail,
                                int_<I-UNROLL_MAX>())
        )

Since impl<I-I> has got to be impl<0>, why use I-I? Also, the impl
template parameter, J, is not used anywhere. I'm sure I could
figure out the reason eventually, but not yet :(. A brief
explanation would help. Also, it's not obvious to me why:

static_cast<Tuple &&>(that)

is needed, since the parameter 'that' has already been declared as
Tuple &&.
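
One general C++ point that bears on this cast: a named parameter
declared as T && is itself an lvalue inside the function, so passing it
on without a cast always selects the lvalue overload. Casting to T &&
(which is what std::forward<T> does) restores the caller's value
category. Whether that is the reason the cast appears in
unrolled_tuple.hpp is an assumption here. The sketch below uses made-up
names (probe, is_rvalue, pass_plain, pass_cast).

```cpp
#include <cassert>

// Hypothetical probe type; nothing here comes from unrolled_tuple.hpp.
struct probe {};

// Overloads that report the value category they receive.
inline bool is_rvalue(probe &)  { return false; }
inline bool is_rvalue(probe &&) { return true; }

// Passing the named parameter on directly: it is always an lvalue.
template<typename T>
bool pass_plain(T &&that)
{
    return is_rvalue(that);
}

// Passing it on through the cast: the caller's category is restored.
template<typename T>
bool pass_cast(T &&that)
{
    return is_rvalue(static_cast<T &&>(that));
}
```

With a temporary argument, pass_plain reaches the lvalue overload while
pass_cast reaches the rvalue overload. With a named object as argument,
both reach the lvalue overload.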

I've no idea what the pros and cons of the two methods
(unrolled vs vertical) are.

* The variadic template method is in:

tuple.cpp

which is close to that here:

http://svn.boost.org/svn/boost/sandbox/variadic_templates/sandbox/slim/test/tuple_impl_horizontal.hpp

in that both methods use multiple inheritance with an int key type
paired with the tuple element type. In the case of tuple.cpp, the
pairing is done with:

template<int I, typename T>
struct tuple_elem

In tuple_impl_horizontal, the pairing is done with:

      template<typename Key, typename Value>
      struct element;
      template<int Key, typename Value>
      struct element<int_key<Key>,Value>
      {
          Value value;
      };

The get functions are essentially the same.
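
For reference, the multiple-inheritance scheme both files share can be
sketched roughly as below. This is an illustration, not the code from
tuple.cpp or tuple_impl_horizontal.hpp: only the name tuple_elem is
taken from the source, and ints, make_ints, horizontal_tuple, get_elem
and get_at are made up here.

```cpp
#include <cassert>

// One base class per element: the int key makes each base unique.
template<int I, typename T>
struct tuple_elem { T value; };

template<int... Is> struct ints {};

// Builds ints<0, 1, ..., N-1>.
template<int N, int... Is>
struct make_ints : make_ints<N - 1, N - 1, Is...> {};
template<int... Is>
struct make_ints<0, Is...> { typedef ints<Is...> type; };

template<typename Ints, typename... Ts> struct tuple_impl;

// Inherit from tuple_elem<0, T0>, tuple_elem<1, T1>, ... side by side.
template<int... Is, typename... Ts>
struct tuple_impl<ints<Is...>, Ts...> : tuple_elem<Is, Ts>... {};

template<typename... Ts>
struct horizontal_tuple
    : tuple_impl<typename make_ints<sizeof...(Ts)>::type, Ts...> {};

// With I given explicitly, T is deduced from the one matching base.
template<int I, typename T>
T &get_elem(tuple_elem<I, T> &e) { return e.value; }

template<int I, typename... Ts>
auto get_at(horizontal_tuple<Ts...> &t) -> decltype(get_elem<I>(t))
{
    return get_elem<I>(t);
}
```

The get is a single derived-to-base conversion: the compiler picks the
base whose key matches I, so there is no recursion over the elements at
the point of access.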

  After looking at the code (and Makefile) it's not clear how the
  benchmark was done. The Makefile has nothing about timing in it,
  and the readme.txt mentions nothing about timing. Looking at the
  tuple.cpp code shows something with tree_builder in it, which sounds
  like it might be the benchmark code; however, so does
  unrolled_tuple.cpp. So, what is the benchmark used to produce the
  chart on slide 12 of trouble_with_tuples.pptx?

>> I thought it also interesting that clang seems to do better than gcc,
>> as reported here:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54710#c10
>
> Interesting. I didn't test with clang.
>

I'll try testing your benchmark, if you provide the code, with both
clang and g++ and post the results.

-regards,
Larry
