Subject: Re: [Boost-users] [MPI, serialization] Segmentation fault in heterogeneous cluster
From: Martin Huenniger (m.huenniger_at_[hidden])
Date: 2010-09-22 06:31:51


Hi,

I haven't used the packed MPI archives because I had never heard of them
before. I just followed the tutorials for the Boost.MPI and
Boost.Serialization libraries.

Maybe I'll try them.

By the way, the standard communication routines still don't work for me.
I tried them again and got the following error:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::mpi::exception> >'
   what(): MPI_Recv: MPI_ERR_TRUNCATE: message truncated
[ipc858:16349] *** Process received signal ***
[ipc858:16349] Signal: Aborted (6)
[ipc858:16349] Signal code: (-6)
[ipc858:16349] [ 0] /lib/libpthread.so.0 [0x7fb13d04ea80]
[ipc858:16349] [ 1] /lib/libc.so.6(gsignal+0x35) [0x7fb13cd1eed5]
[ipc858:16349] [ 2] /lib/libc.so.6(abort+0x183) [0x7fb13cd203f3]
[ipc858:16349] [ 3] /usr/lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x114) [0x7fb13d7be294]
[ipc858:16349] [ 4] /usr/lib/libstdc++.so.6 [0x7fb13d7bc696]
[ipc858:16349] [ 5] /usr/lib/libstdc++.so.6 [0x7fb13d7bc6c3]
[ipc858:16349] [ 6] /usr/lib/libstdc++.so.6 [0x7fb13d7bc7aa]
[ipc858:16349] [ 7] ./my_complex(_ZN5boost15throw_exceptionINS_3mpi9exceptionEEEvRKT_+0x1ef) [0x42f98f]
[ipc858:16349] [ 8] /home/pirx/local/lib/libboost_mpi.so.1.44.0(_ZNK5boost3mpi12communicator4recvEii+0x80) [0x7fb13ec2aba0]
[ipc858:16349] [ 9] ./my_complex(_ZN2FC6WorkerINS_4BallIdEES2_E8get_workERS2_+0x76) [0x434316]
[ipc858:16349] [10] ./my_complex(_ZN2FC12My_complexIdE13working_horseERSo+0xbf) [0x43b5af]
[ipc858:16349] [11] ./my_complex(_ZN2FC12My_complexIdE7computeERSo+0x1d3) [0x43b803]
[ipc858:16349] [12] ./my_complex(main+0x7e6) [0x427cd6]
[ipc858:16349] [13] /lib/libc.so.6(__libc_start_main+0xe6) [0x7fb13cd0b1a6]
[ipc858:16349] [14] ./my_complex(__gxx_personality_v0+0x1b9) [0x426fe9]
[ipc858:16349] *** End of error message ***
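
The mangled names in the backtrace can be decoded with c++filt or, as a
minimal sketch assuming GCC or Clang (Itanium C++ ABI), with
abi::__cxa_demangle. The helper name demangle below is hypothetical and is
not part of Boost or of my_complex.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (available with GCC and Clang)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the human-readable form of an Itanium-ABI
// mangled symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;  // not a valid mangled name; leave it as it is
    }
    std::string result(readable);
    std::free(readable);  // the buffer was allocated with malloc
    return result;
}
```

Applied to frame 8, demangle("_ZNK5boost3mpi12communicator4recvEii") gives
boost::mpi::communicator::recv(int, int) const, i.e. the receive overload
that takes only a source and a tag. MPI_ERR_TRUNCATE means the incoming
message was larger than the receive buffer, so this may indicate that a
payload-free receive matched a message that carried data, although the
trace alone does not prove that.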

Since my workaround exhibits no errors, I assume that there is
some problem within the communication routines of Boost.MPI.

Cheers,
Martin

Matthias Troyer wrote:
> On 21 Sep 2010, at 16:01, Martin Huenniger wrote:
>
>> Hi,
>>
>> the problem is solved:
>>
>> the bug originated from two issues:
>> 1)
>> ...
>>
>> 2.) This fragment is _bad_:
>>
>> ...
>>
>> The next problem is the receiving of binary_archives: Its solution is also a bit under the hood
>>
>> ....
>
>
> Why don't you use the packed MPI archives that should avoid all those issues?
>
> Matthias

