Would you be so kind as to advise me on how to measure scalability, detect bottlenecks, and in general profile the above-mentioned parallel MPI application? My next goal is to generate a Gantt chart, but raw profiling data would be perfectly fine. Do you use any tools to verify the bug-free execution of your parallel applications, given that parallel programs are full of Heisenbugs? Currently, only BOOST_ASSERT checks are placed around both the sending and the receiving of vectors of user-defined classes over MPI.
I look forward to your reply.
Yours faithfully,