
Hi Matthias,

I updated to Boost 1.44.0, but unfortunately the crash now occurs even in local mode (mpirun -np 2). The strange thing is that the serialization code apparently works fine with text archives, but with MPI archives the slave process, upon reception, deserializes the objects with seemingly random values (e.g., huge values instead of 1 or 0 for an integer data member of a structure).

I'm trying to isolate the problem right now and, if I can reproduce it with a minimal example, I will post it here (though it is likely some mistake on my part, as this is the first time I have used MPI and serialization libraries).

Cheers,
Francesco

On Thu, Sep 2, 2010 at 4:26 AM, Matthias Troyer <troyer@phys.ethz.ch> wrote:
On Sep 2, 2010, at 7:39, Francesco Biscani <bluescarni@gmail.com> wrote:
Hello,
I'm getting a segfault when using Boost.MPI on a cluster of heterogeneous machines (x86_64 and ppc64). The problem arises when the "slave" machine, ppc64, receives its payload from the "master" machine, x86_64, and tries to unpack the archive. Tracing down the issue with valgrind and in debug mode, the problem arises here:
Can this be related to some endianness issue? Is Boost.MPI expected to work on heterogeneous clusters?
Hi Francesco,
Have you checked whether a program using the MPI C API can correctly send data on your heterogeneous cluster? Boost.MPI uses the support for heterogeneous machines of the underlying MPI library unless you define the macro BOOST_MPI_HOMOGENEOUS.
Have you also tried the latest Boost release?
Matthias