Subject: Re: [Boost-users] [MPI, serialization] Segmentation fault in heterogeneous cluster
From: Francesco Biscani (bluescarni_at_[hidden])
Date: 2010-09-08 05:07:36


Hi Matthias,

On Wed, Sep 8, 2010 at 4:42 AM, Matthias Troyer <troyer_at_[hidden]> wrote:
>
> Can you just send me a program that exhibits the problem?

I can reproduce the error with the minimal program attached to an
earlier message in this thread. Otherwise, the real code is in the
"mpi" branch of this Git repository:

http://pagmo.git.sourceforge.net/git/gitweb.cgi?p=pagmo/pagmo;a=summary

The relevant code is in src/mpi_environment.cpp and mpi_island.cpp.
The first file implements a class that initialises a
boost::mpi::environment and, on slave nodes, starts a "daemon" that
waits for jobs to execute. The class in the second file is in charge
of sending the jobs from the master node to the slaves.

> Also, did you test whether your MPI library works on the heterogeneous machine when making the MPI_* calls and packing data into a buffer using the MPI_Pack/MPI_Unpack calls? There might be a problem with pack/unpack on your system.

Well, the problem also shows up in a homogeneous configuration, in
both local and remote execution. I have never used those MPI calls
before, but I tried different setups (e.g., Open MPI vs MPICH2, Gentoo
vs Ubuntu, x86 vs ppc64, GCC 4.4 vs 4.5) and all of them have the same
problem. Valgrind comes out completely clean too :/ I'll see if I can
get the hang of the MPI (un)pack calls.

Cheers,

  Francesco.

