> seems to indicate that MPI guarantees that sends and recvs are kept ordered on a single-threaded process not using MPI_ANY_SOURCE. If that is the case then boost::mpi should as well.
Right, so they are ordered, and that's the problem.
boost::mpi needs to know exactly how big the data it's receiving will be. So if you're sending or receiving a non-native type, boost::mpi first transmits the size of the serialized data, and then sends the data itself. One boost::mpi send therefore becomes two MPI sends - and these are ordered.
Receiving mirrors this - one receive gets the size, and only once the size is known does it issue another receive to get the data. If you issue a second irecv before the first has received its length (and thus issued its internal data irecv), then, because of message ordering, the first irecv will get the first length, as expected, but the second irecv will get the first's data, mistaking it for a length.
Hopefully that makes sense. It's an interleaving problem: everything is ordered, but each boost::mpi irecv turns into two underlying MPI irecvs, so two concurrent boost::mpi irecvs interleave with each other, causing the problem.