Subject: Re: [Boost-users] [mpi] failed test and other issues
From: Nick Collier (nick.collier_at_[hidden])
Date: 2009-06-12 14:12:55
I've tested a bit more, and it looks like this error only occurs with
a process count > 3. That is, mpirun -np 2 or 3 works fine, but with
-np 4 or more the error below occurs. This makes me think it is an MPI
issue rather than a serialization issue.
Nick
On Jun 11, 2009, at 5:00 PM, Matthias Troyer wrote:
> Have you tried whether your code can serialize the same vector of
> pointers into a file and deserialize it successfully again? That
> test helps decide whether this is an MPI or a serialization issue.
>
> Matthias
>
> On Jun 11, 2009, at 10:13 AM, Nick Collier <nick.collier_at_[hidden]>
> wrote:
>
>> Hi,
>>
>> I'm using boost-1.39.0 with Open MPI 1.3.2 on OS X 10.5.7. I'm
>> getting some strange behavior when sending and receiving a vector
>> of pointers between processes. The received pointer addresses are
>> sometimes invalid. I'm not using the skeleton content code, just
>> sending and receiving the vector. Some output:
>>
>> [2] processing edges from 1
>> [2] unpacking edge 0x834810
>> [2] unpacking edge 0x834810
>> [0] processing edges from 2
>> [0] unpacking edge 0x834ec0
>> [0] unpacking edge 0x834ec0
>> [2] processing edges from 3
>> [2] unpacking edge 0x8346c0
>> [2] unpacking edge 0x8346c0
>> [2] processing edges from 0
>> [2] unpacking edge 0x1
>> [2] unpacking edge 0x1
>>
>> The hex value is the pointer address, and obviously doing anything
>> with the edge with the weird address causes the app to crash. I
>> can't see that I'm doing anything wrong (although of course that's
>> a possibility) and so I thought I'd run the boost mpi tests. The
>> skeleton_content_test fails consistently with the following output:
>>
>> Broadcasting integer list skeleton from root 0...OK.
>> Broadcasting integer list content from root 0...OK.
>> Broadcasting reversed integer list content from root 0...OK.
>> ../../../boost/test/minimal.hpp(123): exception "memory access
>> violation at address: 0x974bfd1c: non-existent physical address"
>> caught in function: 'int main(int, char**)'
>>
>> **** Testing aborted.../../../boost/test/minimal.hpp(123):
>> exception "memory access violation at address: 0x974bfd1c: non-
>> existent physical address" caught in function: 'int main(int,
>> char**)'
>>
>> **** Testing aborted.
>> **** 1 error detected
>> ../../../boost/test/minimal.hpp(123): exception "memory access
>> violation at address: 0x974bfd1c: non-existent physical address"
>> caught in function: 'int main(int, char**)'
>>
>> **** Testing aborted.
>> **** 1 error detected
>> ../../../boost/test/minimal.hpp(123): exception "memory access
>> violation at address: 0x974bfd1c: non-existent physical address"
>> caught in function: 'int main(int, char**)'
>>
>> **** Testing aborted.
>> **** 1 error detected
>> ../../../boost/test/minimal.hpp(123): exception "memory access
>> violation at address: 0x974bfd1c: non-existent physical address"
>> caught in function: 'int main(int, char**)'
>>
>> **** Testing aborted.
>> **** 1 error detected
>>
>> **** 1 error detected
>> ../../../boost/test/minimal.hpp(123): exception "memory access
>> violation at address: 0x974bfd1c: non-existent physical address"
>> caught in function: 'int main(int, char**)'
>>
>> **** Testing aborted.
>> **** 1 error detected
>>
>> I'm not sure if this is related to my problem but the memory issues
>> made me wonder.
>>
>> Any help is appreciated,
>>
>> thanks,
>>
>> Nick
>>
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users