Boost-MPI :

Subject: Re: [Boost-mpi] multiple irecv tests failure with MPI_ERR_TRUNCATE
From: Walter Woods (woodswalben_at_[hidden])
Date: 2014-02-22 22:39:15


> seems to indicate that MPI guarantees that sends and recvs are kept ordered
> on a single-threaded process not using MPI_ANY_SOURCE. If that is the case
> then boost::mpi should as well.

Right, so they are ordered, and that's the problem.

boost::mpi needs to know exactly the size of the data that it's receiving.
So, if you're sending / receiving a non-native type, boost::mpi first needs
to transmit how big that data is going to be. Then, it sends the data. So
one boost::mpi send becomes two sends to MPI - and these are ordered.

Receiving is the opposite - it uses one receive to get the size, and then,
*after it has the size*, issues another receive to get the data. If you
issue a second irecv before the first has gotten its length (and thus
issued its data irecv internally), then because of message ordering, the
first irecv will get its length, as expected, but the second irecv will get
the first's data, mistaking it for a length message. That data is usually
bigger than the length buffer, which is presumably where the
MPI_ERR_TRUNCATE comes from.

Hopefully that makes sense. It's an interleaving problem - everything is
ordered, but each boost::mpi irecv turns into two underlying MPI irecvs, so
two boost::mpi irecvs running at once interleave, causing the problem.
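
To make the interleaving concrete, here is a toy model in plain C++. It is
not Boost.MPI code and does not call MPI at all; the names WireMsg,
PostedRecv, Outcome and simulate are made up for illustration. It only
assumes what is described above: each send puts a length message and then a
data message on an ordered channel, each receive first posts a length
receive and posts its data receive only once the length has arrived, and
incoming messages match posted receives in the order they were posted.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// A message on one ordered (source, tag) channel. The receiver cannot see
// is_header; it is kept only so the model can report a misread.
struct WireMsg { bool is_header; std::size_t length; std::size_t announced; };

// A receive buffer posted to the (simulated) MPI layer.
struct PostedRecv { std::size_t request_id; bool is_size; std::size_t capacity; };

enum class Outcome { Ok, Truncated, Misread };

// payload_sizes: byte size of each serialized object sent, in order.
// overlap: true  = post every irecv up front (the failing pattern),
//          false = post the next irecv only after the previous one is done.
Outcome simulate(const std::vector<std::size_t>& payload_sizes, bool overlap) {
    const std::size_t header_len = sizeof(std::size_t);

    // Sender side: one logical send becomes a length message plus a data message.
    std::deque<WireMsg> wire;
    for (std::size_t n : payload_sizes) {
        wire.push_back({true, header_len, n});
        wire.push_back({false, n, 0});
    }

    // Receiver side: each logical irecv starts with a length receive only.
    std::deque<PostedRecv> posted;
    std::size_t next_request = 0;
    auto post_next_irecv = [&] {
        if (next_request < payload_sizes.size())
            posted.push_back({next_request++, true, header_len});
    };
    if (overlap) {
        for (std::size_t i = 0; i < payload_sizes.size(); ++i) post_next_irecv();
    } else {
        post_next_irecv();
    }

    // Ordered matching: the oldest message goes to the oldest posted receive.
    while (!wire.empty() && !posted.empty()) {
        WireMsg msg = wire.front();
        wire.pop_front();
        PostedRecv r = posted.front();
        posted.pop_front();

        if (msg.length > r.capacity) return Outcome::Truncated;  // like MPI_ERR_TRUNCATE
        if (msg.is_header != r.is_size) return Outcome::Misread;  // data read as a length

        if (r.is_size) {
            // Only now is the size known, so only now is the data receive posted.
            posted.push_back({r.request_id, false, msg.announced});
        } else if (!overlap) {
            post_next_irecv();  // previous irecv finished; start the next one
        }
    }
    return Outcome::Ok;
}
```

In this model, simulate({100, 200}, true) returns Outcome::Truncated: the
second irecv's length buffer is matched against the first message's 100
bytes of data. simulate({100, 200}, false), which keeps only one irecv
outstanding at a time, returns Outcome::Ok.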

On Fri, Feb 21, 2014 at 5:52 PM, Roy Hashimoto <roy.hashimoto_at_[hidden]> wrote:

> On Fri, Feb 21, 2014 at 11:49 AM, Walter Woods <woodswalben_at_[hidden]> wrote:
>
>> In Roy's case, especially the test file, the problem is having multiple
>> irecvs happening. Look at the underlying request::handle_serialized_irecv
>> implementation in boost/mpi/communicator.hpp - one recv is accomplished
>> through several MPI_Irecv requests issued in sequence. If you have several
>> irecvs running at once, then one is likely to get the other's data as its
>> length.
>>
>
> Thanks for your reply and looking at the boost::mpi source - I haven't got
> that far. I understand what you're saying, but the first few paragraphs of
> this page:
>
> http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node41.html
>
> seems to indicate that MPI guarantees that sends and recvs are kept
> ordered on a single-threaded process not using MPI_ANY_SOURCE. If that is
> the case then boost::mpi should as well.
>
>
>> In other words, if you want to receive multiple messages in the same tag,
>> be sure to only have one IRecv() with that tag running at a time. Data may
>> only be transferred serially (not in parallel) over a single tag anyhow.
>>
>
> I did change my development code to do this.
>
>> Hope that helps,
>> Walt
>>
>
> It does, thanks!
>
> Roy


