Boost Users:
From: Matthias Troyer (troyer_at_[hidden])
Date: 2008-02-10 12:47:50
On 10 Feb 2008, at 02:21, Aydin Buluc wrote:
> Hi all,
> I have been using the serialization library since it is the way to
> go for communicating whole classes in Boost.MPI.
> Previously, I didn't have any problems.
>
> However, I am now experiencing a strange problem. I have successfully run
> all my code on SDSC's Teragrid cluster (a cluster of Itanium
> processors, running intel-linux), but when I tried to do the same
> thing on NCSA's cluster (which is similarly composed of Itaniums
> with the same OS), I just can't build the serialization library
> successfully. It says:
>
> intel-linux.link.dll bin.v2/libs/serialization/build/intel-linux/release/libboost_serialization-il-1_35.so.1.35.0
>
> OBJREAD Error: Could not create mapping for "bin.v2/libs/serialization/build/intel-linux/release/basic_oarchive.o".
>
> icpc: error: problem during multi-file optimization compilation (code 1)
>
>
> Both clusters use the same kernel (2.4.21), the same compiler suite
> (icc 9.1), the same architecture, and the same mpicxx
> (/usr/local/apps/mpich-gm-1.2.6..14b-intel-r2/bin/mpicxx).
> So, that was a little frustrating. However, the build still creates
> the multithreaded library (libboost_serialization-il-mt-1_35) even
> after the error, so I gave it a try. Compilation is flawless, but
> at runtime main() is not even called [it seg faults
> immediately]. When I debug it, I get the following backtrace:
>
> This GDB was configured as "ia64-suse-linux"...
> (gdb) run
> Starting program: /home/ac/aydinb/ParSPGEMM/testpar
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x6000000000011ef0 in typeinfo for boost::serialization::detail::extended_type_info_typeid_0 ()
> (gdb) bt
> #0 0x6000000000011ef0 in typeinfo for boost::serialization::detail::extended_type_info_typeid_0 ()
> #1 0x4000000000103610 in boost::serialization::detail::extended_type_info_typeid_0::less_than(boost::serialization::extended_type_info const&) const (this=0x6000000000018650, rhs=@0x6000000000018910) at libs/serialization/src/extended_type_info_typeid.cpp:22
> (gdb)
>
> I said fine. Maybe the library was broken due to errors in make /
> make install. Since we have the same settings, I copied the
> libraries from SDSC (which works perfectly). Same error, same lines
> from the gdb :(
>
> Any insight into why this might be the case?
Have you checked whether the serialization library works on its own on
those machines? Can you try to write the data types you want to send via
MPI into a binary archive and see whether a similar problem appears? This
looks like a problem with the serialization library rather than with
Boost.MPI.
Matthias