Boost Users :

Subject: [Boost-users] MPICH2 + Boost.MPI Collective Problems
From: Stephan Hackstedt (stephan.hackstedt_at_[hidden])
Date: 2010-08-20 10:08:24


Hi there,

I have a big problem running MPI programs that use the Boost.MPI
library. When I try to run a program on more than one node, collective
operations such as
communicator::barrier <http://boost.org/doc/libs/1_44_0/doc/html/boost/mpi/communicator.html#id918378-bb>,
broadcast <http://boost.org/doc/libs/1_44_0/doc/html/boost/mpi/broadcast.html>,
or even the
environment <http://boost.org/doc/libs/1_44_0/doc/html/boost/mpi/environment.html>
destructor (because it calls MPI_Finalize, which is collective) cause the
program to crash. I get errors like this:

[1]terminate called after throwing an instance of
'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::mpi::exception> >'

[1] what(): MPI_Barrier: Other MPI error, error stack:
[1]PMPI_Barrier(362).................: MPI_Barrier(MPI_COMM_WORLD) failed
[1]MPIR_Barrier_impl(255)............:
[1]MPIR_Barrier_intra(79)............:
[1]MPIC_Sendrecv(186)................:
[1]MPIC_Wait(534)....................:
[1]MPIDI_CH3I_Progress(184)..........:
[1]MPID_nem_mpich2_blocking_recv(895):
[1]MPID_nem_tcp_connpoll(1746).......: Communication error with rank 0:

I also tested this with the simple broadcast example from the Boost.MPI
tutorial and got the same errors.
But when I use the plain MPI equivalents without the Boost.MPI library,
such as MPI_Barrier <http://www.mpi-forum.org/docs/mpi-11-html/node66.html#Node66>,
the program runs fine. I am using MPICH2 on Ubuntu 10.04 machines.
Has anyone had problems like this, or does anyone know a fix?

Regards,

stephan


