Boost Users
Subject: Re: [Boost-users] [MPI, serialization] Segmentation fault in heterogeneous cluster
From: Matthias Troyer (troyer_at_[hidden])
Date: 2010-09-22 06:35:55
Hi, again: it is impossible to find a bug if you are unwilling to send code that reproduces it.
Matthias
On Sep 22, 2010, at 12:31, Martin Huenniger <m.huenniger_at_[hidden]> wrote:
> Hi,
>
> because I have never heard of them before. I just used the information provided by the tutorials for the Boost.MPI and boost/serialization libraries.
>
> Maybe I'll try them.
>
> BTW, somehow the standard communication routines don't work. I tried them again and got the following error:
>
> terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::mpi::exception> >'
> what(): MPI_Recv: MPI_ERR_TRUNCATE: message truncated
> [ipc858:16349] *** Process received signal ***
> [ipc858:16349] Signal: Aborted (6)
> [ipc858:16349] Signal code: (-6)
> [ipc858:16349] [ 0] /lib/libpthread.so.0 [0x7fb13d04ea80]
> [ipc858:16349] [ 1] /lib/libc.so.6(gsignal+0x35) [0x7fb13cd1eed5]
> [ipc858:16349] [ 2] /lib/libc.so.6(abort+0x183) [0x7fb13cd203f3]
> [ipc858:16349] [ 3] /usr/lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x114) [0x7fb13d7be294]
> [ipc858:16349] [ 4] /usr/lib/libstdc++.so.6 [0x7fb13d7bc696]
> [ipc858:16349] [ 5] /usr/lib/libstdc++.so.6 [0x7fb13d7bc6c3]
> [ipc858:16349] [ 6] /usr/lib/libstdc++.so.6 [0x7fb13d7bc7aa]
> [ipc858:16349] [ 7] ./my_complex(_ZN5boost15throw_exceptionINS_3mpi9exceptionEEEvRKT_+0x1ef) [0x42f98f]
> [ipc858:16349] [ 8] /home/pirx/local/lib/libboost_mpi.so.1.44.0(_ZNK5boost3mpi12communicator4recvEii+0x80) [0x7fb13ec2aba0]
> [ipc858:16349] [ 9] ./my_complex(_ZN2FC6WorkerINS_4BallIdEES2_E8get_workERS2_+0x76) [0x434316]
> [ipc858:16349] [10] ./my_complex(_ZN2FC12My_complexIdE13working_horseERSo+0xbf) [0x43b5af]
> [ipc858:16349] [11] ./my_complex(_ZN2FC12My_complexIdE7computeERSo+0x1d3) [0x43b803]
> [ipc858:16349] [12] ./my_complex(main+0x7e6) [0x427cd6]
> [ipc858:16349] [13] /lib/libc.so.6(__libc_start_main+0xe6) [0x7fb13cd0b1a6]
> [ipc858:16349] [14] ./my_complex(__gxx_personality_v0+0x1b9) [0x426fe9]
> [ipc858:16349] *** End of error message ***
>
> Since my workaround exhibits no errors, I assume that there is some problem within the communication routines of Boost.MPI.
>
> Cheers,
> Martin
>
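[Editorial note] The backtrace above lists mangled C++ symbols, which makes it hard to see which Boost.MPI call actually threw. A minimal sketch for decoding them follows. It assumes GCC or Clang (the Itanium C++ ABI), where `abi::__cxa_demangle` from `<cxxabi.h>` is available. The helper name `demangle` is hypothetical and not part of Boost or of the poster's code.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <memory>     // std::unique_ptr
#include <string>

// Hypothetical helper: returns the human-readable form of a mangled symbol,
// or the input unchanged if the demangler rejects it.
std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer, so it is released with std::free.
    std::unique_ptr<char, void (*)(void*)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    return (status == 0 && text) ? std::string(text.get())
                                 : std::string(mangled);
}
```

Calling `demangle("_ZNK5boost3mpi12communicator4recvEii")` on the symbol from frame 8 yields `boost::mpi::communicator::recv(int, int) const`, and frame 9 decodes to a `get_work` member of `FC::Worker` taking an `FC::Ball<double>&`. If that reading is right, the exception was thrown from the two-argument `recv` overload, which takes no value to receive into. A receive of that kind matching a message that carries a payload is one plausible way to end up with MPI_ERR_TRUNCATE, but without the actual code this remains a guess. The `c++filt` command-line tool performs the same decoding.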
> Matthias Troyer wrote:
>> On 21 Sep 2010, at 16:01, Martin Huenniger wrote:
>>> Hi,
>>>
>>> the problem is solved:
>>>
>>> the bug originated from two issues:
>>> 1)
>>> ...
>>>
>>> 2.) This fragment is _bad_:
>>>
>>> ...
>>>
>>> The next problem is the receiving of binary_archives: Its solution is also a bit under the hood
>>>
>>> ....
>> Why don't you use the packed MPI archives that should avoid all those issues?
>> Matthias