
Just an update, in case anyone is still following this.

It turns out that even when I serialize the classes to a text archive, convert the archive to a string, transmit the string via boost::mpi and then rebuild the classes on the other side from the transmitted string, I still get the same error as reported above on heterogeneous clusters (on homogeneous clusters it seems to work fine).

So what I'm doing now is sending the archive in string form directly with the MPI_* primitives (using a std::vector<char> as the buffer and the MPI_CHAR datatype). This works in all the configurations I've tested.

I'm not entirely sure whether the problem is on my side or whether this is a genuine bug, but I would be happy to provide any info/testing needed to solve this issue.

Thanks again,

Francesco.

On Fri, Sep 3, 2010 at 6:36 PM, Francesco Biscani <bluescarni@gmail.com> wrote:
Hi Matthias,
probably I'm doing something really stupid, but it seems the problem is somehow related to shared_ptr. This code reproduces the "MPI message truncated" error:

#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/serialization/assume_abstract.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/shared_ptr.hpp>
#include <boost/serialization/tracking.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/shared_ptr.hpp>
#include <vector>

struct base
{
    virtual void do_something() const = 0;
    template <class Archive>
    void serialize(Archive &ar, const unsigned int)
    {
        ar & values;
    }
    std::vector<double> values;
    virtual ~base() {}
};

BOOST_SERIALIZATION_ASSUME_ABSTRACT(base);

struct derived: public base
{
    void do_something() const {}
    template <class Archive>
    void serialize(Archive &ar, const unsigned int)
    {
        ar & boost::serialization::base_object<base>(*this);
    }
};

BOOST_CLASS_EXPORT(derived);

struct container
{
    template <class Archive>
    void serialize(Archive &ar, const unsigned int)
    {
        ar & ptr;
    }
    boost::shared_ptr<base> ptr;
};

int main()
{
    boost::mpi::environment env;
    boost::mpi::communicator world;
    if (world.rank() == 0) {
        boost::shared_ptr<container> c(new container());
        world.send(1,0,c);
        world.recv(1,0,c);
    } else {
        boost::shared_ptr<container> c(new container());
        world.recv(0,0,c);
        world.send(0,0,c);
    }
    return 0;
}

The error happens when rank 1 is receiving the object:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::mpi::exception> >'
  what():  MPI_Unpack: MPI_ERR_TRUNCATE: message truncated

Thanks,
Francesco.
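
For reference, the workaround described in the update at the top (shipping the archive as plain chars through the MPI_* primitives) might look roughly like the sketch below. It is only an illustration under several assumptions: `payload`, `to_text`, `from_text`, `to_buffer` and `from_buffer` are hypothetical names, the hand-rolled text format merely stands in for a Boost text archive written to a string stream, and the MPI calls appear only as comments so that the snippet compiles without an MPI installation. Sizing the receive buffer with MPI_Probe/MPI_Get_count is one possible choice, not necessarily what was used in the tests mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical payload standing in for the real serialized classes.
struct payload {
    std::vector<double> values;
};

// Stand-in for writing a text archive into a string:
// the element count followed by the space-separated elements.
std::string to_text(const payload &p)
{
    std::ostringstream oss;
    oss << p.values.size();
    for (std::size_t i = 0; i < p.values.size(); ++i) {
        oss << ' ' << p.values[i];
    }
    return oss.str();
}

// Inverse of to_text(): rebuild the payload from its textual form.
payload from_text(const std::string &s)
{
    std::istringstream iss(s);
    std::size_t n = 0;
    iss >> n;
    payload p;
    p.values.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        iss >> p.values[i];
    }
    return p;
}

// Copy the text into the std::vector<char> that is handed to MPI.
std::vector<char> to_buffer(const std::string &s)
{
    return std::vector<char>(s.begin(), s.end());
}

// Turn the received chars back into a string.
std::string from_buffer(const std::vector<char> &buf)
{
    return std::string(buf.begin(), buf.end());
}

// Sender side (requires <mpi.h>; dest and tag are placeholders):
//   std::vector<char> buf = to_buffer(to_text(p));
//   MPI_Send(&buf[0], static_cast<int>(buf.size()), MPI_CHAR,
//            dest, tag, MPI_COMM_WORLD);
//
// Receiver side (assumption: probe first to learn the message size):
//   MPI_Status status;
//   MPI_Probe(src, tag, MPI_COMM_WORLD, &status);
//   int count = 0;
//   MPI_Get_count(&status, MPI_CHAR, &count);
//   std::vector<char> buf(count);
//   MPI_Recv(&buf[0], count, MPI_CHAR, src, tag,
//            MPI_COMM_WORLD, MPI_STATUS_IGNORE);
//   payload q = from_text(from_buffer(buf));
```

Since the message is plain text sent as MPI_CHAR, the receiver only needs the character count to size its buffer, and the archive is rebuilt from the received chars on the other side.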