[mpi] broadcast performance

Hi all - I need some advice on using the broadcast function of boost::mpi. We have a large buffer (sometimes gigabytes) that we need to get to all child nodes. We currently use boost::serialization with a binary archive to write the data into a std::vector<char>, send that buffer across with MPI_Bcast, and deserialize on the other side. I've started testing similar functionality using boost::mpi::broadcast to handle the serialization and deserialization. Tracing through the code, it seems that the data is sent to the child nodes via isend. Is there something I can do to ensure that Bcast is used instead? With only a couple of nodes the former is fine, but with more nodes the MPI implementation of Bcast may do a much better job (logarithmic, or even constant, time in the number of nodes). What are the suggestions for getting a fast broadcast in this case? I don't think that using skeletons will help, since each instance of the broadcast will have unique data with a potentially different layout. Thanks, Brian

Hi Brian, if I understood correctly, you're actually doing something like: std::vector<char> gigaVec; MPI_Bcast(blah, blah, ..., &gigaVec[0]) and want to replace that with boost::mpi::broadcast, is that correct? Just do it the same way: if the type of the container is an MPI type, you're guaranteed that the underlying MPI implementation will be called. Regards, Júlio.

Okay, I can do that. I was just wondering if there was a trick to make it happen under the hood. I'm curious as to why Bcast doesn't get called by boost::mpi::broadcast for non-trivial types. Thanks, Brian On Wed, Sep 5, 2012 at 7:00 PM, Júlio Hoffimann <julio.hoffimann@gmail.com> wrote:
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

Brian, you can think of Boost.MPI as a very well-designed wrapper. All it does is call the underlying C implementation (Open MPI, MPICH, others) when the types are covered by the MPI standard. On the other hand, I agree with you: maybe it would be possible to specialize a template for std::vector<T> that handles it as a raw buffer. Does anyone have an opinion on this? When I have time, I'll think it through carefully and see if I can contribute a patch. Regards, Júlio. 2012/9/6 Brian Budge <brian.budge@gmail.com>

Hi Júlio - I may be completely wrong, but I was under the impression that when a send call happens, serialization magic occurs that builds an MPI_Datatype, and that by then handing the data to MPI_Send etc. we avoid an extra copy? But perhaps that won't work in my case. I doubt that MPI_Recv is capable of building a complex hierarchy back up, including pointers, using operator new, etc. Perhaps you have to have a fully instantiated object of the same kind in order to use this functionality with MPI_Recv? I have a virtual message hierarchy, and the messages (or shared_ptrs to messages) perform virtual dispatch upon being recv'd. Is there anything performance-wise to be gained by using boost::mpi for send/recv/broadcast? Or is the MPI_Datatype performance gain only applicable to classes that have a (perhaps complex, but) concrete layout, with object instantiation on the stack? It seems that if I can't get the MPI_Datatype benefit for my types, I may be better off maintaining my own buffers for serialization, so I can potentially lower the number of memory allocations. Thanks, Brian On Thu, Sep 6, 2012 at 3:29 AM, Júlio Hoffimann <julio.hoffimann@gmail.com> wrote:

Hi Brian, I don't remember the details, but what you said is completely right: when we pass an object to any of the Boost.MPI methods, it can either be of an MPI type, in which case it's properly forwarded to the C implementation, or it can be of a serializable type, in which case the magic happens. At the other end of the wire, Boost.MPI will magically deserialize the object and you have no additional work; you keep working in high-level C++. The main bottleneck here is the act of serializing/deserializing repeatedly. As you already know, Boost.MPI solved this problem for some cases (those with a fixed layout): the skeleton-and-content approach. When that approach is not applicable, you have to live with C raw buffers and the &vec[0] trick. I'll take a better look to see if specializing the template with that trick is safe and covered by the C++ standard. You're also free to investigate and produce patches. :-) Regards, Júlio.

The only idea I had was potentially to use MPI_Hindexed and MPI_Address to create the full memory layout, and then go through the data calling placement new, etc. Given the lack of documentation around how to actually do this with MPI, I can't really think of anything better than what is currently happening inside boost::mpi. If I need the better performance, I will have to uglify my code :) Thanks, Brian On Thu, Sep 6, 2012 at 10:49 AM, Júlio Hoffimann <julio.hoffimann@gmail.com> wrote:
participants (2)
- Brian Budge
- Júlio Hoffimann