Subject: [Boost-mpi] boost mpi communicator confusion
From: Ryan Lewis (me_at_[hidden])
Date: 2013-12-05 13:02:02
I am having some issues with the boost::mpi::communicator object.
I am trying to use the communicator constructor that is supposed to be
equivalent to MPI_Comm_create, and I am experiencing deadlock. The plan is
to build a vector of communicators. If I have p^2 processors arranged in a
p x p grid, I will build a communicator for every row and column. In
addition, within a row/column with p processors, I will build p-2 more
communicators, corresponding to successively removing the leftmost
processor. For example, if I have a 4 x 4 grid, I want to have:
0 1 2 3 <-- one of these for every row/column
1 2 3 <--
2 3 <-- these sub communicators, of course ignoring trivial ones.
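
To make the layout above concrete, here is a minimal sketch in plain C++ (no MPI calls) that lists the world ranks for each group. It assumes row-major numbering (rank = row * p + col), which matches the 2,5,8 column example for a 3 x 3 grid; line_ranks and suffix_groups are hypothetical helper names, not part of Boost.MPI.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: world ranks of one row or column of a p x p grid,
// assuming row-major numbering (rank = row * p + col).
std::vector<int> line_ranks(int p, int index, bool is_row) {
    std::vector<int> ranks;
    for (int k = 0; k < p; ++k)
        ranks.push_back(is_row ? index * p + k : k * p + index);
    return ranks;
}

// Hypothetical helper: the full line followed by the line with its first
// 1, 2, ..., p-2 members removed. Single-member groups are skipped.
std::vector<std::vector<int>> suffix_groups(const std::vector<int>& line) {
    std::vector<std::vector<int>> groups;
    for (std::size_t start = 0; start + 2 <= line.size(); ++start)
        groups.emplace_back(line.begin() + start, line.end());
    return groups;
}
```

For a 4 x 4 grid, suffix_groups(line_ranks(4, 0, true)) gives the three groups drawn in the diagram for the first row.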
Originally I just had each processor build all of the groups it would
belong to, making sure that all processors build row communicators before
column communicators, and that all processors in the same row/col built
their communicators in the same order. This results in a strange deadlock.
All processors enter the correct constructor call. For example, processors
2,5,8, representing one column, would attempt to build a sub communicator
of the WORLD communicator, and they each have a group which lists 2,5,8 in
the same sorted order, but they deadlock and never leave the call.
The manual page at
http://www.open-mpi.org/doc/v1.6/man3/MPI_Comm_create.3.php
does not make clear whether all processors in WORLD must execute this call
or only the ones within the group. When I try to have all processors
execute such a call, I get a segmentation fault in the destructor call
for the invalid communicator.
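
For reference, the MPI standard describes MPI_Comm_create as collective over the parent communicator: every process in that communicator makes the call, and processes outside the group receive MPI_COMM_NULL. One way to keep the calls matched is for every rank to walk the same global list of groups in the same order. The sketch below only builds that list in plain C++ (no MPI calls), assuming row-major numbering; creation_schedule and participates are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: one entry per collective creation call, identical on
// every rank. Rows come first, then columns; within a line the full group
// comes first, followed by the groups with leading members removed.
// Assumes row-major numbering (rank = row * p + col).
std::vector<std::vector<int>> creation_schedule(int p) {
    std::vector<std::vector<int>> schedule;
    for (int pass = 0; pass < 2; ++pass) {           // 0 = rows, 1 = columns
        for (int line = 0; line < p; ++line) {
            for (int start = 0; start + 2 <= p; ++start) {
                std::vector<int> ranks;
                for (int k = start; k < p; ++k)
                    ranks.push_back(pass == 0 ? line * p + k : k * p + line);
                schedule.push_back(ranks);
            }
        }
    }
    return schedule;
}

// Hypothetical helper: true if this rank would get a valid communicator for
// the given entry. Non-members would still make the call, get a null
// communicator back, and should not store or use it.
bool participates(const std::vector<int>& ranks, int rank) {
    return std::find(ranks.begin(), ranks.end(), rank) != ranks.end();
}
```

In an actual program each schedule entry would correspond to one communicator construction made by all ranks of the WORLD communicator, with only the participating ranks keeping the result.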
I'm hoping someone can explain to me how to do this properly.