Boost Users
Subject: Re: [Boost-users] [proto] using proto with nvcc
From: Joel Falcou (joel.falcou_at_[hidden])
Date: 2010-02-09 06:43:50
> I've never tested proto with nvcc, so I'm not surprised it doesn't work.
> I would need someone familiar with the quirks of this compiler (Troy?)
> to help get proto working with it. And there's no guarantee it's
> possible -- proto requires a fairly compliant compiler.
>
Today Eric and I took a look at what was happening. The problem is twofold:
1/ First, we need to run the config script so that it generates a proper
config/user.hpp file for NVCC.
2/ The BOOST_PROTO_DECLTYPE_ macro.
The original version is something like:
# define BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(NESTED, EXPR)                      \
    BOOST_TYPEOF_NESTED_TYPEDEF_TPL(BOOST_PP_CAT(nested_and_hidden_, NESTED), EXPR)  \
    static int const sz = sizeof(boost::proto::detail::check_reference(EXPR));      \
    struct NESTED                                                                    \
      : boost::mpl::if_c<                                                            \
            1==sz                                                                    \
          , typename BOOST_PP_CAT(nested_and_hidden_, NESTED)::type &                \
          , typename BOOST_PP_CAT(nested_and_hidden_, NESTED)::type                  \
        >                                                                            \
    {};
# define BOOST_PROTO_DECLTYPE_(EXPR, TYPE)                                           \
    BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(BOOST_PP_CAT(nested_, TYPE), (EXPR))    \
    typedef typename BOOST_PP_CAT(nested_, TYPE)::type TYPE;
# endif
NVCC chokes on that, saying that nested_type::type is used without
template qualifiers. I can't make head or tail of this message, but
changing it to:
# define BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(NESTED, EXPR)                      \
    BOOST_TYPEOF_NESTED_TYPEDEF_TPL(BOOST_PP_CAT(hidden_, NESTED), EXPR)             \
    BOOST_STATIC_CONSTANT(int, sz =                                                  \
        sizeof(boost::proto::detail::check_reference(EXPR)));                        \
    typedef typename boost::mpl::if_c<                                               \
        1==sz                                                                        \
      , typename BOOST_PP_CAT(hidden_, NESTED)::type &                               \
      , typename BOOST_PP_CAT(hidden_, NESTED)::type                                 \
    >::type
# define BOOST_PROTO_DECLTYPE_(EXPR, TYPE)                                           \
    BOOST_PROTO_DECLTYPE_NESTED_TYPEDEF_TPL_(BOOST_PP_CAT(nested_, TYPE), (EXPR)) TYPE;
# endif
i.e. removing the NESTED struct and computing the result of if_c directly
into TYPE, makes NVCC happily compile proto code in .cu files.
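To make the shape of that change easier to see outside the macro, here is a
minimal sketch of the two variants. It is not the proto code: it assumes a
C++11 compiler and swaps std::conditional in for boost::mpl::if_c, and the
names via_nested_struct and via_direct_typedef are made up for illustration.

```cpp
#include <type_traits>

// Shape of the original macro: a helper struct inherits from the
// selector, and the result is read back through nested::type.
// (std::conditional stands in for boost::mpl::if_c here.)
template<bool IsRef, class T>
struct via_nested_struct
{
    struct nested : std::conditional<IsRef, T &, T> {};
    typedef typename nested::type type;
};

// Shape of the modified macro: no helper struct, the selector's
// ::type is taken directly in a single typedef.
template<bool IsRef, class T>
struct via_direct_typedef
{
    typedef typename std::conditional<IsRef, T &, T>::type type;
};

// Both variants compute the same type on a conforming compiler.
static_assert(std::is_same<via_nested_struct<true, int>::type,
                           via_direct_typedef<true, int>::type>::value,
              "reference case agrees");
static_assert(std::is_same<via_nested_struct<false, int>::type,
                           via_direct_typedef<false, int>::type>::value,
              "value case agrees");
```

The two forms are equivalent as far as the language is concerned; the second
one simply avoids the nested struct with a dependent base class.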
Now we can compile proto and fusion code in .cu files for host functions
(i.e. code executed by the CPU). Making it work in device functions (i.e.
code run by the GPU itself) needs a substantial amount of work, which
includes prefixing ALL proto, mpl, fusion, etc. functions with __host__
__device__ so they can be called inside kernels.
If anyone is interested, the user.hpp for nvcc CUDA is available on
request. We are considering a global BOOST_GPU_ENABLED macro that
evaluates to __host__ __device__ when compiled with nvcc and to nothing
otherwise, and applying a local patch for that in whatever boost
components it makes sense in. Considering the impact and the possibly
restricted audience, I'm not sure it's worth global support from boost,
but it'll be here and available if needed.
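A rough sketch of what such a macro could look like follows. It assumes
detection through __CUDACC__, which nvcc defines when compiling CUDA source
files; the function twice is a made-up example and not part of any boost
component.

```cpp
// Expands to the CUDA qualifiers under nvcc and to nothing elsewhere,
// so the same header compiles with an ordinary host compiler.
#if defined(__CUDACC__)
#  define BOOST_GPU_ENABLED __host__ __device__
#else
#  define BOOST_GPU_ENABLED
#endif

// Hypothetical function annotated with the macro: callable from host
// code everywhere, and also from kernels when built with nvcc.
template<class T>
BOOST_GPU_ENABLED inline T twice(T x)
{
    return x + x;
}
```

With a plain host compiler the macro disappears and twice is an ordinary
inline function template; under nvcc the same definition becomes usable
inside kernels as well.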