Subject: Re: [Boost-users] [proto] using proto with nvcc
From: Manjunath Kudlur (keveman_at_[hidden])
Date: 2010-02-09 11:29:54
Joel,
> Today with Eric we took a look at what happened. The problem is twofold:
>
> 1/ first, we need to run the config script so it generates a proper
> config/user.hpp file for NVCC
>
> 2/ the BOOST_PROTO_DECLTYPE_ macro:
> The original version is something like:
>
> # define BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(NESTED, EXPR) \
>     BOOST_TYPEOF_NESTED_TYPEDEF_TPL(BOOST_PP_CAT(nested_and_hidden_, NESTED), EXPR) \
>     static int const sz = sizeof(boost::proto::detail::check_reference(EXPR)); \
>     struct NESTED \
>       : boost::mpl::if_c< \
>             1==sz \
>           , typename BOOST_PP_CAT(nested_and_hidden_, NESTED)::type & \
>           , typename BOOST_PP_CAT(nested_and_hidden_, NESTED)::type \
>         > \
>     {};
> # define BOOST_PROTO_DECLTYPE_(EXPR, TYPE) \
>     BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(BOOST_PP_CAT(nested_, TYPE), (EXPR)) \
>     typedef typename BOOST_PP_CAT(nested_, TYPE)::type TYPE;
> # endif
>
>
> NVCC chokes on that, saying that nested_type::type is used without template
> qualifiers. I can't make head or tail of this message BUT.
>
> By changing it to:
>
> # define BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(NESTED, EXPR) \
>     BOOST_TYPEOF_NESTED_TYPEDEF_TPL(BOOST_PP_CAT(hidden_, NESTED), EXPR) \
>     BOOST_STATIC_CONSTANT(int, sz = sizeof(boost::proto::detail::check_reference(EXPR))); \
>     typedef typename boost::mpl::if_c< \
>         1==sz \
>       , typename BOOST_PP_CAT(hidden_, NESTED)::type & \
>       , typename BOOST_PP_CAT(hidden_, NESTED)::type \
>     >::type
> # define BOOST_PROTO_DECLTYPE_(EXPR, TYPE) \
>     BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(BOOST_PP_CAT(nested_, TYPE), (EXPR)) TYPE;
> # endif
>
> i.e. by removing the NESTED struct and typedef'ing the result of if_c directly
> as TYPE, NVCC happily compiles proto code in a .cu file.
>
> Now, we can compile proto and fusion code in .cu files for HOST functions (i.e.
> code executed by the CPU). Making it work in DEVICE functions (i.e. code run by
> the GPU itself) needs a substantial amount of work, which includes prefixing
> ALL proto, mpl, fusion etc. functions with __host__ __device__ so they can be
> called inside kernels.
>
> If anyone is interested, the user.hpp for nvcc CUDA is available on request.
> We are pondering a global BOOST_GPU_ENABLED macro that evaluates to
> __host__ __device__ when compiled with nvcc and to nothing otherwise, and a
> local patch applying it in whatever boost components it makes sense in.
> Considering the impact and the possibly restricted audience for that, I'm
> not sure it's worth global support from boost, but it'll be here and
> available if needed.
Awesome detective work! BTW, what version of CUDA did you try this
with? I am interested in your modifications to user.hpp; could you
please send them to me?
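
To check my own understanding of the macro change, here is a minimal sketch of the two forms side by side. It assumes a C++11 compiler and uses std::conditional as a stand-in for boost::mpl::if_c; the names with_nested_struct and with_direct_typedef are made up for illustration and are not proto's. It only shows that both forms name the same type in standard C++; it does not reproduce or explain the nvcc error.

```cpp
#include <type_traits>

// Form 1 (like the original macro): a nested struct inherits from the
// metafunction, and the result is reached through nested::type.
template<typename T, bool IsRef>
struct with_nested_struct
{
    struct nested : std::conditional<IsRef, T &, T> {};
    typedef typename nested::type type;
};

// Form 2 (like the modified macro): the metafunction's result is
// typedef'ed directly, with no intermediate nested struct.
template<typename T, bool IsRef>
struct with_direct_typedef
{
    typedef typename std::conditional<IsRef, T &, T>::type type;
};

// Both forms select T& when IsRef is true and T otherwise.
static_assert(std::is_same<with_nested_struct<int, true>::type,
                           with_direct_typedef<int, true>::type>::value,
              "both forms should name the same type");
```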
Thanks,
Manjunath
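
P.S. For anyone following along, the BOOST_GPU_ENABLED idea from the quoted message might look roughly like the sketch below. This is my guess at it, not the actual patch: the macro name is taken from the message, the __CUDACC__ check relies on nvcc defining that symbol when it compiles CUDA sources, and plus_one and apply_twice are hypothetical examples rather than proto, mpl, or fusion code.

```cpp
// Sketch only: expands to __host__ __device__ under nvcc and to nothing
// under any other compiler, so the same header builds in both settings.
#if defined(__CUDACC__)
#   define BOOST_GPU_ENABLED __host__ __device__
#else
#   define BOOST_GPU_ENABLED
#endif

// Hypothetical function object, annotated so it could be called from
// host code and, under nvcc, from inside a kernel.
struct plus_one
{
    BOOST_GPU_ENABLED int operator()(int x) const { return x + 1; }
};

// Hypothetical generic function carrying the same annotation.
template<typename F>
BOOST_GPU_ENABLED int apply_twice(F f, int x)
{
    return f(f(x));
}
```

With a regular host compiler the macro vanishes and this is ordinary C++; every function on the call path of a kernel would need the same annotation, which is the "substantial amount of work" mentioned in the quoted message.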