Boost Users :

Subject: Re: [Boost-users] boostMPI asychronous communication
From: Riccardo Murri (riccardo.murri_at_[hidden])
Date: 2010-06-28 15:28:47


Hello Jack,

On Mon, Jun 28, 2010 at 7:46 PM, Jack Bryan <dtustudy68_at_[hidden]> wrote:
> This is the main part of me code, which may have deadlock.
>
> Master:
> for (iRank = 0; iRank < availableRank ; iRank++)
> {
> destRank = iRank+1;
> for (taski = 1; taski <=  TaskNumPerRank ; taski++)
> {
> resultSourceRank = destRank;
> recvReqs[taskCounterT2] = world.irecv(resultSourceRank, upStreamTaskTag, resultTaskPackageT2[iRank][taskCounterT3]);
> reqs = world.isend(destRank, taskTag, myTaskPackage);
> ++taskCounterT2;
> }
>
> // taskTotalNum = availableRank * TaskNumPerRank
> // right now, availableRank =1, TaskNumPerRank =2
> mpi::wait_all(recvReqs, recvReqs+(taskTotalNum));
> -----------------------------------------------
> worker:
> while (1)
> {
> world.recv(managerRank, downStreamTaskTag, resultTaskPackageW);
> do its local work on received task;
> destRank = masterRank;
> reqs = world.isend(destRank, taskTag, myTaskPackage);
> if (recv end signal)
>   break;
> }

1. I can't see where the outer for-loop in the master is closed; is the
wait_all() part of that loop? (I assume it is not.) Can you send a
minimal program that I can feed to a compiler and test? That would
help.

2. Are you sure there is no tag mismatch between master and worker?

  master: world.isend(destRank, taskTag, myTaskPackage);
                                 ^^^^^^^
  worker: world.recv(managerRank, downStreamTaskTag, resultTaskPackageW);
                                   ^^^^^^^^^^^^^^^^^

Unless master::taskTag == worker::downStreamTaskTag, the recv() will
wait forever.

Similarly, the following requires that master::upStreamTaskTag ==
worker::taskTag:

  master: ... = world.irecv(resultSourceRank, upStreamTaskTag, ...);
  worker: world.isend(destRank, taskTag, myTaskPackage); // destRank==masterRank

3. Do the source/destination ranks match? The master waits for messages
from source ranks 1..availableRank (inclusive range), and the worker
waits for a message from "managerRank" (is this 0, and is it the same
rank as the "masterRank" it replies to?).

4. Does the master work if you replace the main loop with the following?

    Master:
    for (iRank = 0; iRank < availableRank ; iRank++)
    {
      destRank = iRank+1;
      for (taski = 1; taski <=  TaskNumPerRank ; taski++)
        {
          // XXX: the following code does not contain any reference to
          // "taski": it is sending "TaskNumPerRank" copies of the
          // same message ...
          reqs = world.isend(destRank, taskTag, myTaskPackage);
        };
    }; // I assume the outer loop does *not* include the wait_all()

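    // expect one result message per task
    int n = 0;
    while (n < taskTotalNum) {
      // block until a message from any source, with any tag, is available
      mpi::status status = world.probe();
      // workers are ranks 1..availableRank, so the row is source-1 (iRank above)
      world.recv(status.source(), status.tag(),
                 resultTaskPackageT2[status.source() - 1][taskCounterT3]);
      ++n;
    }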
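
To make points 2 and 3 concrete, here is a rough sketch of the matching
rule in plain C++, with no MPI involved. A receive only completes for a
pending message whose destination, source and tag all agree with what
the receiver asked for. Envelope, Mailbox, try_recv() and the tag
values 1 and 2 are hypothetical names and numbers I made up for the
illustration; they are not part of Boost.MPI, and real MPI also has
wildcards (any source, any tag) that this toy ignores.

```cpp
#include <deque>
#include <string>

// Toy stand-in for a message in flight; this is NOT a Boost.MPI type.
struct Envelope {
    int source;
    int dest;
    int tag;
    std::string payload;
};

// Hypothetical mailbox holding messages that were sent but not yet received.
class Mailbox {
public:
    void send(int source, int dest, int tag, const std::string& payload) {
        Envelope e = {source, dest, tag, payload};
        pending_.push_back(e);
    }

    // Looks for a pending message addressed to `me` whose source AND tag
    // both match. Returns false when there is none; a real blocking
    // recv() would keep waiting at this point instead of returning.
    bool try_recv(int me, int source, int tag, std::string& out) {
        for (std::deque<Envelope>::iterator it = pending_.begin();
             it != pending_.end(); ++it) {
            if (it->dest == me && it->source == source && it->tag == tag) {
                out = it->payload;
                pending_.erase(it);
                return true;
            }
        }
        return false;
    }

private:
    std::deque<Envelope> pending_;
};

// Scenario from point 2: the master (rank 0) sends with taskTag while the
// worker (rank 1) listens on downStreamTaskTag. The tag values are assumed.
bool mismatched_tag_finds_nothing() {
    const int taskTag = 1;
    const int downStreamTaskTag = 2;
    Mailbox box;
    box.send(0, 1, taskTag, "task");
    std::string got;
    return !box.try_recv(1, 0, downStreamTaskTag, got);
}
```

In this model mismatched_tag_finds_nothing() returns true: the message
is sitting in the mailbox, but the worker never sees it because the tag
differs. With a blocking recv() that is exactly the "waits forever"
case, and the same happens if the source rank is wrong.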

Best regards,
Riccardo

