From: Allen Sanderson (allen_at_[hidden])
Date: 2024-05-29 22:27:11


Hello,

I am trying to sort out an odd crash on exit while using NVCC (via Kokkos-CUDA) and the Boost Unit Test Framework. Here are my data points:

Kokkos-CUDA with the Boost test tool infrastructure - Crash
Kokkos-HIP with the Boost test tool infrastructure - works
Kokkos-Serial with the Boost test tool infrastructure - works
Pure CUDA with the Boost test tool infrastructure - works
Serial with the Boost test tool infrastructure - works
Kokkos-CUDA alone (e.g. normal application) - works

For the first case I have three scenarios (used only to highlight the Kokkos usage; the code does not crash in Kokkos::initialize), two of which crash:

if(true) Kokkos::initialize(argc,argv);
 crash in CUDA - SEGFAULT

if(false) Kokkos::initialize(argc,argv);
 crash elsewhere - Subprocess aborted

#if 0
Kokkos::initialize()
#endif
 No crash

For the first scenario the call stack is the following:

#0 0x00001555521b5d9a in ?? () from ./cuda/12.2.0/lib64/libcudart.so.12
#1 0x00001555521b8c14 in ?? () from ./cuda/12.2.0/lib64/libcudart.so.12
#2 0x000015555219d882 in ?? () from ./cuda/12.2.0/lib64/libcudart.so.12
#3 0x00001555521a072b in ?? () from ./cuda/12.2.0/lib64/libcudart.so.12
#4 0x000015554b3b5797 in __cxa_finalize () from /lib64/libc.so.6
#5 0x0000155551301a87 in __do_global_dtors_aux () from build_kokkos_cuda/library/Operators/libOperators-g.so.5.5.0
#6 0x00007fffffffad10 in ?? ()
#7 0x000015555532dcee in _dl_fini () at dl-fini.c:141

For the second scenario, the call stack is quite deep after the call to __cxa_finalize (); otherwise the same four frames (4-7) appear.
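
Frames #4 and #5 show the fault occurring while a shared library's static destructors run at process exit. As a generic illustration only (this is not the Kokkos, CUDA, or Boost code, and I am not claiming it is the cause), the sketch below uses two hypothetical types, FakeRuntime and Resource, to show why teardown order matters: a resource released after the runtime it depends on has been shut down sees a different state than one released before.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a runtime library (NOT a real CUDA or Kokkos API).
struct FakeRuntime {
    bool alive = true;
    void shutdown() { alive = false; }
};

// Hypothetical resource that consults the runtime when it is released.
struct Resource {
    FakeRuntime& rt;
    std::vector<std::string>& log;

    Resource(FakeRuntime& r, std::vector<std::string>& l) : rt(r), log(l) {}

    ~Resource() {
        // A real resource might call into the runtime here; this one only
        // records whether the runtime was still usable at release time.
        log.push_back(rt.alive ? "released while runtime alive"
                               : "released after runtime shutdown");
    }
};

// Runs one teardown sequence and returns what the resource observed.
std::vector<std::string> teardown_order(bool release_first) {
    std::vector<std::string> log;
    FakeRuntime rt;
    auto res = std::make_unique<Resource>(rt, log);
    if (release_first) {
        res.reset();    // resource released first, then the runtime
        rt.shutdown();
    } else {
        rt.shutdown();  // runtime shut down first, then the resource
        res.reset();
    }
    return log;
}
```

Calling teardown_order(true) and teardown_order(false) gives the two orderings. In a real program the second ordering would correspond to a destructor calling into a library that has already been torn down.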

At this point the only culprit I can perhaps point to is nvcc, as I know that previous versions of Boost and nvcc did not get along. As such, I am wondering whether our usage is a corner case, because I know of another project that successfully uses all three: Boost, Kokkos, and CUDA.

I am using Boost 1.85, Kokkos 4.3, and CUDA 12.2. Any insights or thoughts? Our unit tests work with Kokkos-HIP, so we are able to validate results.

Allen Sanderson
SCI Institute
University of Utah

