Subject: Re: [Boost-users] Performance optimization in Boost using std::vector<>
From: Ilja Honkonen (ilja.honkonen_at_[hidden])
Date: 2015-02-12 10:51:59


Hello
> There is a known performance problem with serializing a std::vector
> over MPI.
> Basically, this prevents you from ever reaching the performance of C.
> The problem is on the receive side. When you receive a vector and
> don't know its size, the receive side has to:
> - get the number of elements of the vector
> - resize the vector (which initializes the elements)
> - receive the elements into the vector data (which reinitializes them)
> The C version of the idiom:
> - gets the number of elements
> - reserves (as opposed to resizes) the memory for the elements
> - receives the elements into the vector (initializing them once).
> This might make a small or a large performance difference, profile!

According to the attached program, there seems to be a much larger
performance problem than initializing vector elements. The program first
sends a vector of doubles using plain MPI, then sends another, identical
vector with boost::mpi, and prints how long each step took in seconds.
Note that boost::mpi also sends two messages for run-time sized
containers. For vectors of 1e6 items the program prints (the first number
is the MPI rank):

mpi
0 resize: 0.0126891, send: 0.00988925, recv: 0
1 resize: 0.0131643, send: 0, recv: 0.00955247
boost::mpi
0 resize: 0.0096425, send: 0.279135, recv: 0
1 resize: 0, send: 0, recv: 0.295702

For vectors of 1e7 items:

mpi
0 resize: 0.0974027, send: 0.0538886, recv: 0
1 resize: 0.105708, send: 0, recv: 0.0456324
boost::mpi
0 resize: 0.0517177, send: 2.70333, recv: 0
1 resize: 0, send: 0, recv: 2.82339

And vectors of 5e7 items:

mpi
0 resize: 0.590099, send: 0.226269, recv: 0
1 resize: 0.440719, send: 0, recv: 0.375706
boost::mpi
0 resize: 0.198448, send: 13.5335, recv: 0
1 resize: 0, send: 0, recv: 14.0518

The boost::mpi version is always well over 10 times slower (roughly 30 to
60 times in the runs above). It also seems to run out of memory with a
smaller number of items, which suggests that unnecessary copies of the
data are created somewhere. Based on experience with more complex
programs (e.g. http://dx.doi.org/10.1016/j.jastp.2014.08.012), I wouldn't
recommend boost::mpi for high performance computing. Or, if this is user
error on my part, at least high performance is easier to get with pure MPI.

I used boost-1.57.0, g++ (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7) and
mpirun (Open MPI) 1.6.5.

Ilja



