Boost logo

Boost :

Subject: Re: [boost] [Fusion] port to c++0x
From: Christopher Schmidt (mr.chr.schmidt_at_[hidden])
Date: 2009-09-26 11:40:56


Larry Evans schrieb:
> There's no type_at in:
>
>
> https://svn.boost.org/trac/boost/browser/sandbox/variadic_templates/libs/mpl/sandbox/tuple_non_recur.package_range_c.cpp
>
>
> There is, however, a tuple<,>::at_type<I> nested struct. Is that what
> you're referring to? If so, I don't understand why that would be
> expensive since neither at_type or what it calls, at_elem, use
> recursion. Could you explain why at_type would be expensive? Or maybe
> you've run some timing tests on it. If so, could you provide the test
> driver so I could test to see why it's so expensive?
>
>
> Now, AFAICT, the number of template instantiations is, neglecting
> those needed for the package_range_c at:
>
> https://svn.boost.org/trac/boost/browser/sandbox/variadic_templates/libs/mpl/sandbox/tuple_non_recur.package_range_c.cpp#L54
>
>
> which (as Doug mentioned near the bottom of his 12 Jan 2009 post) are
> amortized, the number of "overhead" template instantiations is N+C
> where N is the number of elements and C is a small constant. I think
> C would be around 1, i.e. 1 for _Tuple_impl. The N is for the N
> _Tuple_element<Tag,Value> values where Tag is some instance of
> mpl::integral_c<IndexType,IndexValue>. By "overhead" I means the
> number needed besides the N for each element in the tuple and the 1
> for the overall tuple (in the case of vector, that would be
> mpl::vector).
>
> Does that make sense?
>
>
> TIA.
>
> -Larry

I am sorry to have kept you waiting so long. I have been busy with
moving to a new flat.

Consider an instantiation of a tuple with 7 elements, the types foo,
bar, int, char, blub, dub and mup.

My fusion vector will directly evolve to

vector<foo,bar,int,char,blub,dub,mup>
   : detail::vector_impl<0,foo,bar,int,char,blub,dub,mup>
       : detail::vector_impl<4,blub,dub,mup>

As you elaborated, your tuple will evolve to

tuple<char, foo,bar,int,char,blub,dub,mup>
   : _Tuple_impl<
         mpl::package_c<char,0,1,2,3,4,4,5,6,7>,
         foo,bar,int,char,blub,dub,mup
>
       : _Tuple_element<mpl::integral_c<char, 0>, foo>
       , _Tuple_element<mpl::integral_c<char, 1>, bar>
       , _Tuple_element<mpl::integral_c<char, 2>, int>
       , _Tuple_element<mpl::integral_c<char, 3>, char>
       , _Tuple_element<mpl::integral_c<char, 4>, blub>
       , _Tuple_element<mpl::integral_c<char, 5>, dub>
       , _Tuple_element<mpl::integral_c<char, 6>, mup>

Summed up, there are N/4+1 template instantiations for my vector and
N+2+x for your tuple, with x being the amount of instantiations for
mpl::package_range_forward. If mpl::package_range_forward<...> is not
cached, x will be N+C. Also you need to consider the overhead of
template expressions that are not instantiated but still are evaluated.

I think your at_type implementation is expensive as a complete template
function invocation expression is evaluated. This involves a lot of
steps in the lookup process that are not necessary in a meta-only
implementation. Also the lookup process should scale commensurate to N.
I cannot see how this should be faster than a controlled meta-recursion
with a break condition.

On top of that your implementation is limited to compiler with decltype
support. I tried to avoid implementations in which certain c++0x
features depend on others. A user, whose compiler does support variadic
templates but not decltype, should still be able to enjoy the variadic
template candy. This is not much of a problem in our specific case but
in other cases an independent implementation is crucial.

Either way, there are tons of pros and cons for both of our
implementations. I really do like your implementation. It looks and
feels native. Maybe it faster than mine. Maybe it is not. Who knows?
Compiler are too different to let one state performance predictions a
priori.
I judge compile-time performance of an expression by the number steps
(and their orders of growth) involved into evaluating it, because I
think, this is what comes closest to actual real-life performance in
most cases. This led to my conclusion that an unrolled recursive
implementation might be faster.

I attached a test driver to this post. The testcase instantiates 25*25
distinct tuples (either vertical or horizontal ones) with 0 to 10
elements. Then optionally a type (meta-at) and/or object lookup (at) is
done on each element.
I used gcc 4.4.1 (TDM-1 mingw32) and ptime to measure performance. It is
pretty interesting that the instantiation of the horizontal tuple type
is just slightly slower. The at-operations differ all the more surprisingly.

=== g++ -std=c++0x -c test.cpp ===
Execution time: 5.274 s
=== g++ -std=c++0x -c test.cpp -DAT ===
Execution time: 11.821 s
=== g++ -std=c++0x -c test.cpp -DMETA_AT ===
Execution time: 12.161 s
=== g++ -std=c++0x -c test.cpp -DAT -DMETA_AT ===
Execution time: 19.565 s
=== g++ -std=c++0x -c test.cpp -DVERTICAL ===
Execution time: 4.884 s
=== g++ -std=c++0x -c test.cpp -DAT -DVERTICAL ===
Execution time: 7.136 s
=== g++ -std=c++0x -c test.cpp -DMETA_AT -DVERTICAL ===
Execution time: 5.835 s
=== g++ -std=c++0x -c test.cpp -DAT -DMETA_AT -DVERTICAL ===
Execution time: 8.156 s

I hope I made sense.

Larry Evans schrieb:
> Yeah, I saw that 4 a lot of places. I think other libraries use some
> macro to set such a number. For example, BOOST_MPL_LIMIT_UNROLLING
> here:
>
>
https://svn.boost.org/trac/boost/browser/trunk/boost/mpl/aux_/iter_fold_if_impl.hpp#L190

>
>
> Are there plans to add a similar macro to variadic fusion vector?
This is a good idea. I will put it on my TODO-list. Thanks!

By the way, your variadic mpl implementation looks great. Do you have
any plans to integrate your work into the trunk?

-Christopher

template<int>
struct identity_int
{};

#ifdef VERTICAL
template<int Index,typename... Args>
struct tuple_impl;
template<int Index>
struct tuple_impl<Index>
{};
template<int Index,typename H0>
struct tuple_impl<Index,H0>
{
        typedef H0 t0;
        H0 h0;
        H0& get(identity_int<Index+0>){return h0;}
        H0 const& get(identity_int<Index+0>)const{return h0;}
};
template<int Index,typename H0, typename H1>
struct tuple_impl<Index,H0,H1>
{
        typedef H0 t0; typedef H1 t1;
        H0 h0;
        H0& get(identity_int<Index+0>){return h0;}
        H0 const& get(identity_int<Index+0>)const{return h0;}
        H1 h1;
        H1& get(identity_int<Index+1>){return h1;}
        H1 const& get(identity_int<Index+1>)const{return h1;}
};

template<int Index,typename H0, typename H1, typename H2>
struct tuple_impl<Index,H0,H1,H2>
{
        typedef H0 t0; typedef H1 t1; typedef H2 t2;
        H0 h0;
        H0& get(identity_int<Index+0>){return h0;}
        H0 const& get(identity_int<Index+0>)const{return h0;}
        H1 h1;
        H1& get(identity_int<Index+1>){return h1;}
        H1 const& get(identity_int<Index+1>)const{return h1;}
        H2 h2;
        H2& get(identity_int<Index+2>){return h2;}
        H2 const& get(identity_int<Index+2>)const{return h2;}
};
template<int Index,typename H0, typename H1, typename H2, typename H3>
struct tuple_impl<Index,H0,H1,H2, H3>
{
        typedef H0 t0; typedef H1 t1; typedef H2 t2; typedef H3 t3;
        H0 h0;
        H0& get(identity_int<Index+0>){return h0;}
        H0 const& get(identity_int<Index+0>)const{return h0;}
        H1 h1;
        H1& get(identity_int<Index+1>){return h1;}
        H1 const& get(identity_int<Index+1>)const{return h1;}
        H2 h2;
        H2& get(identity_int<Index+2>){return h2;}
        H2 const& get(identity_int<Index+2>)const{return h2;}
        H3 h3;
        H3& get(identity_int<Index+3>){return h3;}
        H3 const& get(identity_int<Index+3>)const{return h3;}
};
template<int Index,typename H0, typename H1, typename H2, typename H3, typename... Others>
struct tuple_impl<Index,H0,H1,H2, H3,Others...>:
        tuple_impl<Index+4,Others...>
{
        typedef tuple_impl<Index+4,Others...> base;
        typedef H0 t0; typedef H1 t1; typedef H2 t2; typedef H3 t3;
        using base::get;
        H0 h0;
        H0& get(identity_int<Index+0>){return h0;}
        H0 const& get(identity_int<Index+0>)const{return h0;}
        H1 h1;
        H1& get(identity_int<Index+1>){return h1;}
        H1 const& get(identity_int<Index+1>)const{return h1;}
        H2 h2;
        H2& get(identity_int<Index+2>){return h2;}
        H2 const& get(identity_int<Index+2>)const{return h2;}
        H3 h3;
        H3& get(identity_int<Index+3>){return h3;}
        H3 const& get(identity_int<Index+3>)const{return h3;}
};
template<typename... Args>
struct tuple:
        tuple_impl<0,Args...>
{};

template<typename Tuple, int Index>
struct meta_value_at:
        meta_value_at<typename Tuple::base, Index-4>
{};
template<typename Tuple>
struct meta_value_at<Tuple,0>
{
        typedef typename Tuple::t0 type;
};
template<typename Tuple>
struct meta_value_at<Tuple,1>
{
        typedef typename Tuple::t1 type;
};
template<typename Tuple>
struct meta_value_at<Tuple,2>
{
        typedef typename Tuple::t2 type;
};
template<typename Tuple>
struct meta_value_at<Tuple,3>
{
        typedef typename Tuple::t3 type;
};
#else
template<int Max,int... Args>
struct make_package:
        make_package<Max-1,Max-1,Args...>
{};
template<int...>
struct package;
template<int... Args>
struct make_package<0,Args...>
{
        typedef package<Args...> type;
};

template<typename, typename Arg>
struct element{Arg a;};
template<typename Keys, typename... Args>
struct tuple_impl;
template<int... Indices, typename... Args>
struct tuple_impl<package<Indices...>, Args...>:
        element<identity_int<Indices>, Args>...
{};

template<typename... Args>
struct tuple:
        tuple_impl<typename make_package<sizeof...(Args)>::type, Args...>
{
        template<int I, typename T>
        static T& at_elem(element<identity_int<I>,T>& a){return a.a;}
        template<int I, typename T>
        static T const& at_elem(element<identity_int<I>,T>const& a){return a.a;}

        template<int I, typename T>
        static T at_type_helper(element<identity_int<I>,T>& a);
        template<int I>
        struct at_type
        {
                typedef decltype(at_type_helper<I>(*reinterpret_cast<tuple*>(0))) type;
        };
};
#endif

template<int I,int J, int Max, typename... Args>
struct get_tuple:
        get_tuple<I,J,Max-1,identity_int<I*1000+J*10+Max>,Args...>
{};
template<int I,int J, typename... Args>
struct get_tuple<I,J,0, Args...>
{
        typedef tuple<Args...> type;
};

template<int I,int J=0>
struct test_impl
{
        typedef typename get_tuple<I, J, J%10>::type tuple_type;

        template<int K>
        static void at_test(tuple_type& t,identity_int<K>)
        {
#ifdef META_AT
# ifdef VERTICAL
                typename meta_value_at<tuple_type,K>::type();
# else
                typename tuple_type::template at_type<K>::type();
# endif
#endif
#ifdef AT
# ifdef VERTICAL
                t.get(identity_int<K>());
# else
                tuple_type::template at_elem<K>(t);
# endif
#endif

                at_test(t,identity_int<K+1>());
        }
        static void at_test(tuple_type&,identity_int<J%10>){}

        static void exec()
        {
                tuple_type t;
                at_test(t,identity_int<0>());
                test_impl<I,J+1>::exec();
        }
};
template<int I>
struct test_impl<I,25>
{
        static void exec(){}
};

template<int I=0>
struct test
{
        static void exec()
        {
                test_impl<I>::exec();
                test<I+1>::exec();
        }
};
template<>
struct test<25>
{
        static void exec(){}
};

int main()
{
        test<>::exec();
}


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk