Subject: [Boost-mpi] boost mpi communicator confusion
From: Ryan Lewis (me_at_[hidden])
Date: 2013-12-05 13:02:02


Hi,

I am having some issues with the boost::mpi::communicator object.

I am trying to use the communicator constructor that is supposed to be
equivalent to MPI_Comm_create, and I am experiencing deadlock. The plan is
to build a vector of communicators. If I have p^2 processors arranged in a
p x p grid, I will build a communicator for every row and column. Within
each row/column of p processors, I will also build p-2 more communicators,
corresponding to successively removing the leftmost processor. For example,
with a 4 x 4 grid, I want to have:
0 1 2 3 <-- one of these for every row/column
   1 2 3 <--
      2 3 <-- these sub communicators, of course ignoring trivial ones.
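
To make the layout above concrete, here is a minimal sketch that only
enumerates the rank lists for each group; it does not call MPI at all. The
function name suffix_groups is hypothetical, and the row-major rank layout
(rank = row * p + col) is an assumption, chosen because it matches ranks
2, 5, 8 forming a column in a 3 x 3 grid.

```cpp
#include <vector>

// Assumption: ranks are laid out row-major, so the processor at
// (row, col) in a p x p grid has rank row * p + col.
// Returns all row groups first, then all column groups. Within each
// row/column the groups shrink by dropping the leading rank, stopping at
// size 2 (the single-rank group is trivial and skipped).
std::vector<std::vector<int>> suffix_groups(int p) {
    std::vector<std::vector<int>> groups;
    for (int pass = 0; pass < 2; ++pass) {  // pass 0: rows, pass 1: columns
        for (int line = 0; line < p; ++line) {
            for (int start = 0; start + 1 < p; ++start) {
                std::vector<int> ranks;
                for (int k = start; k < p; ++k) {
                    ranks.push_back(pass == 0 ? line * p + k : k * p + line);
                }
                groups.push_back(ranks);
            }
        }
    }
    return groups;
}
```

For p = 4 this yields 24 groups (three per row and three per column), and
every processor that calls it gets the same list in the same order.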

Originally I just had each processor build all of the groups it would
belong to, making sure that all processors build row communicators before
column communicators, and that all processors in the same row/column build
their communicators in the same order. This results in a strange deadlock.
All processors enter the correct constructor call. For example, processors
2, 5, 8, representing one column, would attempt to build a sub communicator
of the WORLD communicator. They each have a group which lists 2, 5, 8 in
the same sorted order, but they deadlock and never leave the call.

When consulting the manual page:
http://www.open-mpi.org/doc/v1.6/man3/MPI_Comm_create.3.php
it is not clear whether all processors in WORLD must execute this call or
only the ones within the group. When I try to have all processors execute
such a call, I get a segmentation fault in the destructor call for the
invalid communicator.
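
For reference, the MPI standard describes MPI_Comm_create as collective
over the parent communicator. If that applies here, every processor in
WORLD would have to walk the same global list of groups in the same order,
and processors outside a given group would get back a null communicator.
The sketch below only models that ordering in plain C++; it makes no MPI
calls. The names CreateStep and creation_schedule are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One entry per collective creation call: which group is passed, and
// whether the calling rank belongs to it (a non-member would be handed
// a null communicator by the real call).
struct CreateStep {
    std::size_t group_index;
    bool is_member;
};

// Hypothetical helper: every rank iterates over the SAME list of groups
// in the SAME order, so each creation call is matched on all ranks.
std::vector<CreateStep> creation_schedule(
    const std::vector<std::vector<int>>& groups, int rank) {
    std::vector<CreateStep> steps;
    steps.reserve(groups.size());
    for (std::size_t i = 0; i < groups.size(); ++i) {
        const bool member =
            std::find(groups[i].begin(), groups[i].end(), rank) !=
            groups[i].end();
        steps.push_back({i, member});
    }
    return steps;
}
```

In a real program each step would correspond to one call of the
communicator constructor that takes a parent communicator and a group, and
only the steps with is_member set would keep the resulting communicator.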

I'm hoping someone can explain to me how to do this properly.

Best,
-rhl


