Boost test and openmpi

Hello,

I have the following code:

    #define BOOST_TEST_DYN_LINK
    #define BOOST_TEST_MODULE mpi_functionality
    #include <boost/test/unit_test.hpp>
    #include "mpi.h"

    struct MPI_Functionality_Fixture
    {
      MPI_Functionality_Fixture()
      {
        m_argc = boost::unit_test::framework::master_test_suite().argc;
        m_argv = boost::unit_test::framework::master_test_suite().argv;
      }

      int m_argc;
      char** m_argv;
    };

    BOOST_FIXTURE_TEST_SUITE( MPI_Functionality_TestSuite, MPI_Functionality_Fixture )

    BOOST_AUTO_TEST_CASE( mpi_functionality_parenv_init )
    {
      MPI_Init(&m_argc,&m_argv);
    }

    BOOST_AUTO_TEST_CASE( mpi_functionality_parenv_finalize )
    {
      MPI_Finalize();
    }

    BOOST_AUTO_TEST_SUITE_END()

The code compiles fine, but I have the following problem when I try to run it. When I run as 'mpirun ./my_executable' or 'mpirun -np 3 ./my_executable' everything goes fine. On the other hand, when I run as ./my_executable I get the following error:

    unknown location(0): fatal error in "mpi_functionality_parenv_finalize": unrecognized signal

Outside of Boost, it is perfectly possible to write a program which calls MPI_Init and MPI_Finalize and it does not have to be used with mpirun. I wonder if this (different) behavior of Boost test is normal. I did not find any remarks regarding Boost test and mpi in the tutorial.

Best regards,
Martin Vymazal

[Please do not mail me a copy of your followup] boost-users@lists.boost.org spake the secret code <4075154.HSbG0KVyOh@arlin17> thusly:
> BOOST_AUTO_TEST_CASE( mpi_functionality_parenv_init ) { MPI_Init(&m_argc,&m_argv); }
> BOOST_AUTO_TEST_CASE( mpi_functionality_parenv_finalize ) { MPI_Finalize(); }

These aren't separate test cases, they are setup/teardown for an individual test case.

<http://www.open-mpi.org/doc/v1.4/man3/MPI_Init.3.php> and <http://www.open-mpi.org/doc/v1.4/man3/MPI_Finalize.3.php> indicate that these two functions should always be paired. Furthermore, not even MPI_Init can be called after MPI_Finalize. Therefore, you can't call these pairs of functions once per test case or once per fixture, you can only call them once per process.

Therefore, you probably want to write your own implementation of main() that calls MPI_Init before invoking the test runner and MPI_Finalize after it returns. Since you are using the shared library version of Boost.Test, see this page for an example of how to write your own implementation of main.
<http://user.xmission.com/~legalize/tmp/boost.test/libs/test/doc/html/test/guide/compilation/shared_library.html>

--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>

On Feb 19, 2014, at 2:01 PM, Richard <legalize+jeeves@mail.xmission.com> wrote:
> [Please do not mail me a copy of your followup]
> boost-users@lists.boost.org spake the secret code <4075154.HSbG0KVyOh@arlin17> thusly:
>> BOOST_AUTO_TEST_CASE( mpi_functionality_parenv_init ) { MPI_Init(&m_argc,&m_argv); }
>> BOOST_AUTO_TEST_CASE( mpi_functionality_parenv_finalize ) { MPI_Finalize(); }
> These aren't separate test cases, they are setup/teardown for an individual test case.
> <http://www.open-mpi.org/doc/v1.4/man3/MPI_Init.3.php> and <http://www.open-mpi.org/doc/v1.4/man3/MPI_Finalize.3.php> indicate that these two functions should always be paired.
> Furthermore, not even MPI_Init can be called after MPI_Finalize. Therefore, you can't call these pairs of functions once per test case or once per fixture, you can only call them once per process.
> Therefore, you probably want to write your own implementation of main() that handles MPI_Init and MPI_Finalize before calling the test runner.
A global fixture seems like a more appropriate solution than implementing main(). BOOST_GLOBAL_FIXTURE http://www.boost.org/doc/libs/1_55_0/libs/test/doc/html/utf/user-guide/fixtu...

> A global fixture seems like a more appropriate solution than implementing main().
> BOOST_GLOBAL_FIXTURE

Indeed. Something like the following works just fine:

    struct MPIFixture {
        MPIFixture()  { MPI_Init(NULL, NULL); }
        ~MPIFixture() { MPI_Finalize(); }
    };
    BOOST_GLOBAL_FIXTURE(MPIFixture);

- Rhys

Hi,

it doesn't work for me. I tried the following code:

    /// Generate automatically the 'main' function for the test module
    #define BOOST_TEST_DYN_LINK
    #define BOOST_TEST_MODULE mpi_test
    #include <boost/test/unit_test.hpp>
    #include <iostream>
    #include "mpi.h"

    struct MPIFixture {
      MPIFixture()
      {
        std::cout << "Global fixture constructor:" << std::endl;
        int argc = boost::unit_test::framework::master_test_suite().argc;
        char** argv = boost::unit_test::framework::master_test_suite().argv;
        MPI_Init(&argc,&argv);
        int is_initialized;
        int is_finalized;
        MPI_Initialized(&is_initialized);
        MPI_Finalized(&is_finalized);
        std::cout << "MPI environment is initialized: " << is_initialized << std::endl;
        std::cout << "MPI environment is finalized: " << is_finalized << std::endl;
      }

      ~MPIFixture()
      {
        std::cout << "Global fixture destructor" << std::endl;
        int is_initialized;
        int is_finalized;
        MPI_Initialized(&is_initialized);
        MPI_Finalized(&is_finalized);
        std::cout << "MPI environment is initialized: " << is_initialized << std::endl;
        std::cout << "MPI environment is finalized: " << is_finalized << std::endl;
        MPI_Finalize();
        MPI_Initialized(&is_initialized);
        MPI_Finalized(&is_finalized);
        std::cout << "MPI environment is initialized: " << is_initialized << std::endl;
        std::cout << "MPI environment is finalized: " << is_finalized << std::endl;
      }
    };

    struct MyTestSuiteFixture { };

    BOOST_GLOBAL_FIXTURE( MPIFixture )

    BOOST_FIXTURE_TEST_SUITE( MyTestSuite, MyTestSuiteFixture )

    BOOST_AUTO_TEST_CASE( dummy_test )
    {
      std::cout << "Running dummy test case" << std::endl;
    }

    BOOST_AUTO_TEST_SUITE_END()

The output when running with mpirun:

    Global fixture constructor:
    MPI environment is initialized: 1
    MPI environment is finalized: 0
    Running 1 test case...
    Running dummy test case
    Global fixture destructor
    MPI environment is initialized: 1
    MPI environment is finalized: 0
    MPI environment is initialized: 1
    MPI environment is finalized: 1

Output when running directly the executable:

    Global fixture constructor:
    MPI environment is initialized: 1
    MPI environment is finalized: 0
    Running 1 test case...
    Running dummy test case
    Global fixture destructor
    MPI environment is initialized: 1
    MPI environment is finalized: 0
    Segmentation fault (core dumped)

I'm using boost 1.55.0 and gcc 4.8.2 on linux.

Best regards,
Martin Vymazal

On Friday 21 February 2014 10:38:21 Rhys Ulerich wrote:
>> A global fixture seems like a more appropriate solution than implementing main().
>> BOOST_GLOBAL_FIXTURE
> Indeed. Something like the following works just fine:
>     struct MPIFixture {
>         MPIFixture()  { MPI_Init(NULL, NULL); }
>         ~MPIFixture() { MPI_Finalize(); }
>     };
>     BOOST_GLOBAL_FIXTURE(MPIFixture);
> - Rhys
> _______________________________________________
> Boost-users mailing list
> Boost-users@lists.boost.org
> http://lists.boost.org/mailman/listinfo.cgi/boost-users

> it doesn't work for me.

To be sure I understand, because you were vague: you believe it shouldn't segfault when the binary is executed without using mpirun?

Whether that is even possible depends on your MPI stack. There's zero guarantee in the MPI standard, IIRC, that an MPI-based binary can be executed without mpirun.

The latter case does not segfault for me on MPICH2 1.4.1p1, gcc 4.6.3, Boost 1.5.1.

I suggest you attach a debugger and isolate the origin of the segfault.

- Rhys

What bothers me is the fact that it doesn't segfault (with or without mpirun) as a 'classical' executable with main() function, but it crashes when I run it as a boost test without mpirun. I must admit I didn't know that there's no guarantee that this should actually work without mpirun and maybe I'm complaining about a problem where there isn't any.

I ran the executable with gdb and curiously enough, it terminated correctly without reporting any problems. I also tried valgrind to see if I get any memory errors. The segfault happens when I call MPI_Finalize() despite the fact that mpi environment has been initialized but not finalized yet. The output is below.

Martin

==11986== Command: ./utest-mpi
==11986==
Global fixture constructor:
==11986== Syscall param writev(vector[...]) points to uninitialised byte(s)
==11986==    at 0x14F0D9E7: writev (in /usr/lib/libc-2.19.so)
==11986==    by 0x1790BF72: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:249)
==11986==    by 0x1790D0B3: mca_oob_tcp_peer_send (oob_tcp_peer.c:204)
==11986==    by 0x179109BB: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==11986==    by 0x172FDC5A: orte_rml_oob_send (rml_oob_send.c:136)
==11986==    by 0x172FE228: orte_rml_oob_send_buffer (rml_oob_send.c:270)
==11986==    by 0x17D1B7BF: modex (grpcomm_bad_module.c:573)
==11986==    by 0x5577324: ompi_mpi_init (ompi_mpi_init.c:541)
==11986==    by 0x558E7D2: PMPI_Init (pinit.c:84)
==11986==    by 0x40E52B: boost::unit_test::ut_detail::global_fixture_impl<MPIFixture>::test_start(unsigned long) (utest-Poisson.cpp:509)
==11986==    by 0x6A44763: boost::unit_test::ut_detail::callback0_impl_t<int, boost::unit_test::ut_detail::test_start_caller>::invoke() (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==    by 0x6A36175: boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==  Address 0x164d2341 is 161 bytes inside a block of size 256 alloc'd
==11986==    at 0x4C2AA3E: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11986==    by 0x56060F7: opal_dss_buffer_extend (dss_internal_functions.c:63)
==11986==    by 0x560650D: opal_dss_copy_payload (dss_load_unload.c:164)
==11986==    by 0x55DACC2: orte_grpcomm_base_pack_modex_entries (grpcomm_base_modex.c:861)
==11986==    by 0x17D1B6CE: modex (grpcomm_bad_module.c:563)
==11986==    by 0x5577324: ompi_mpi_init (ompi_mpi_init.c:541)
==11986==    by 0x558E7D2: PMPI_Init (pinit.c:84)
==11986==    by 0x40E52B: boost::unit_test::ut_detail::global_fixture_impl<MPIFixture>::test_start(unsigned long) (utest-Poisson.cpp:509)
==11986==    by 0x6A44763: boost::unit_test::ut_detail::callback0_impl_t<int, boost::unit_test::ut_detail::test_start_caller>::invoke() (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==    by 0x6A36175: boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==    by 0x6A369B2: boost::execution_monitor::execute(boost::unit_test::callback0<int> const&) (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==    by 0x6A3FDB1: boost::unit_test::framework::run(unsigned long, bool) (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==
MPI environment is initialized: 1
MPI environment is finalized: 0
Running 1 test case...
Running dummy test case
Global fixture destructor
MPI environment is initialized: 1
MPI environment is finalized: 0
==11986== Invalid write of size 8
==11986==    at 0x6A358AC: ??? (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==    by 0x14E643FF: ??? (in /usr/lib/libc-2.19.so)
==11986==    by 0x14F0D9E6: writev (in /usr/lib/libc-2.19.so)
==11986==    by 0x1790BF72: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:249)
==11986==    by 0x1790D0B3: mca_oob_tcp_peer_send (oob_tcp_peer.c:204)
==11986==    by 0x179109BB: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==11986==    by 0x172FDC5A: orte_rml_oob_send (rml_oob_send.c:136)
==11986==    by 0x172FE228: orte_rml_oob_send_buffer (rml_oob_send.c:270)
==11986==    by 0x55F6EEC: orte_routed_base_register_sync (routed_base_register_sync.c:86)
==11986==    by 0x17B17276: finalize (routed_binomial.c:115)
==11986==    by 0x55F64F7: orte_routed_base_close (routed_base_components.c:126)
==11986==    by 0x55D6BB4: orte_ess_base_app_finalize (ess_base_std_app.c:265)
==11986==  Address 0xa98 is not stack'd, malloc'd or (recently) free'd
==11986==
==11986==
==11986== Process terminating with default action of signal 11 (SIGSEGV)
==11986==  Access not within mapped region at address 0xA98
==11986==    at 0x6A358AC: ??? (in /usr/lib/libboost_unit_test_framework.so.1.55.0)
==11986==    by 0x14E643FF: ??? (in /usr/lib/libc-2.19.so)
==11986==    by 0x14F0D9E6: writev (in /usr/lib/libc-2.19.so)
==11986==    by 0x1790BF72: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:249)
==11986==    by 0x1790D0B3: mca_oob_tcp_peer_send (oob_tcp_peer.c:204)
==11986==    by 0x179109BB: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==11986==    by 0x172FDC5A: orte_rml_oob_send (rml_oob_send.c:136)
==11986==    by 0x172FE228: orte_rml_oob_send_buffer (rml_oob_send.c:270)
==11986==    by 0x55F6EEC: orte_routed_base_register_sync (routed_base_register_sync.c:86)
==11986==    by 0x17B17276: finalize (routed_binomial.c:115)
==11986==    by 0x55F64F7: orte_routed_base_close (routed_base_components.c:126)
==11986==    by 0x55D6BB4: orte_ess_base_app_finalize (ess_base_std_app.c:265)
==11986==  If you believe this happened as a result of a stack
==11986==  overflow in your program's main thread (unlikely but
==11986==  possible), you can try to increase the size of the
==11986==  main thread stack using the --main-stacksize= flag.
==11986==  The main thread stack size used in this run was 8388608.
==11986==
==11986== HEAP SUMMARY:
==11986==     in use at exit: 530,391 bytes in 4,383 blocks
==11986==   total heap usage: 8,298 allocs, 3,915 frees, 13,119,175 bytes allocated
==11986==
==11986== LEAK SUMMARY:
==11986==    definitely lost: 5,064 bytes in 34 blocks
==11986==    indirectly lost: 5,390 bytes in 22 blocks
==11986==      possibly lost: 25,881 bytes in 584 blocks
==11986==    still reachable: 494,056 bytes in 3,743 blocks
==11986==         suppressed: 0 bytes in 0 blocks
==11986== Rerun with --leak-check=full to see details of leaked memory
==11986==
==11986== For counts of detected and suppressed errors, rerun with: -v
==11986== Use --track-origins=yes to see where uninitialised values come from
==11986== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 2 from 1)
Segmentation fault (core dumped)

On Friday 21 March 2014 08:50:25 Rhys Ulerich wrote:
>> it doesn't work for me.
> To be sure I understand because you were vague... You believe it shouldn't segfault when the binary is executed without using mpirun?
> If it's even possible depends on your MPI stack. There's zero guarantee in the MPI standard, IIRC, that an MPI-based binary can be executed without mpirun.
> The latter case does not segfault for me on MPICH2 1.4.1p1, gcc 4.6.3, Boost 1.5.1.
> I suggest you attach debugger and isolate the origin of the segfault.
> - Rhys

> What bothers me is the fact that it doesn't segfault (with or without mpirun) as a 'classical' executable with main() function, but it crashes when I run it as a boost test without mpirun.

Any difference in behavior if you muck with the various header-only vs linked Boost.Test configurations? In the linking scenario, any difference if you manually move -lboost_unit_test_framework around relative to what `mpicxx -show` tells you?

On Fri, Mar 21, 2014 at 10:06 AM, Martin Vymazal <martin.vymazal@vki.ac.be> wrote:
> ==11986== by 0x5577324: ompi_mpi_init (ompi_mpi_init.c:541)

Having recently been bitten by OpenMPI vs MPICH issues in another context, I'd give MPICH a shot. Glad you've got it somewhat isolated. I'm sorry but I've currently not got the time to dig further into this. Best of luck.

- Rhys

[Please do not mail me a copy of your followup] boost-users@lists.boost.org spake the secret code <EF0D056B-3640-465C-A404-78D62F563E2C@verizon.net> thusly:
> A global fixture seems like a more appropriate solution than implementing main().
> BOOST_GLOBAL_FIXTURE http://www.boost.org/doc/libs/1_55_0/libs/test/doc/html/utf/user-guide/fixtu...

A global fixture doesn't work quite the way you think it does. It is possible for test cases to execute before your global fixture has been instantiated. That is, a global fixture is only "global" for a single translation unit and not multiple translation units.

Therefore, I recommend using a custom implementation of main instead of a global fixture, since this *really* should only be done once per process and not the way global fixtures work.

Richard <legalize+jeeves <at> mail.xmission.com> writes:
> [Please do not mail me a copy of your followup]
> boost-users <at> lists.boost.org spake the secret code <EF0D056B-3640-465C-A404-78D62F563E2C <at> verizon.net> thusly:
>> A global fixture seems like a more appropriate solution than implementing main().
>> BOOST_GLOBAL_FIXTURE http://www.boost.org/doc/libs/1_55_0/libs/test/doc/html/utf/user-guide/fixture/global.html
> A global fixture doesn't work quite the way you think it does. It is possible for test cases to execute before your global fixture has been instantiated.

No. It is not.

> That is, a global fixture is only "global" for a single translation unit and not multiple translation units.

No. This is not true. A global fixture is indeed global and is executed once per test module, regardless of which test file it is defined in. And it is done before the testing begins. I'd appreciate it if you did not mislead the users with information you are not sure about.

[Please do not mail me a copy of your followup] boost-users@lists.boost.org spake the secret code <loom.20140520T095957-837@post.gmane.org> thusly:
>> That is, a global fixture is only "global" for a single translation unit and not multiple translation units.
> No. This is not true. Global fixture is indeed global and is executed once per test module regardless which test file it is defined in. And it is done before the testing begins.

By "test module", I assume you are referring to the executable. I don't know why we need another term for this, but that's a different discussion.

I'll do my testing again, but this is inconsistent with what I observed from the example linked below. I agree that what you describe is how it was documented, but it wasn't what I observed. Look at the source code here:
<http://user.xmission.com/~legalize/boost.test/libs/test/doc/html/test/reference/test_case/boost_global_fixture.html>

When I ran those test cases, I would see some test cases printing their output before all the global fixtures were created. I will try again when I get home.

[Please do not mail me a copy of your followup] boost-users@lists.boost.org spake the secret code <llg9ls$cb2$1@ger.gmane.org> thusly:
> I'll do my testing again, but this is inconsistent with what I observed from this example:

When I got the unexpected behavior of a test case executing before the global fixture was initialized, that was in my work environment on a more complicated unit testing project. Using the simple example I created for the documentation, I am unable to reproduce that behavior, but there is still some unexpected behavior.

Using these two source files in an executable:
<http://user.xmission.com/~legalize/boost.test/libs/test/doc/src/examples/global_fixture.cpp>
<http://user.xmission.com/~legalize/boost.test/libs/test/doc/src/examples/another_global_fixture.cpp>

I have the following in the code:

global_fixture.cpp:
    BOOST_GLOBAL_FIXTURE(global_fixture)
    BOOST_FIXTURE_TEST_SUITE(suite, suite_fixture)
    BOOST_AUTO_TEST_CASE(one)
    BOOST_GLOBAL_FIXTURE(global_fixture2)
    BOOST_AUTO_TEST_CASE(two)
    BOOST_AUTO_TEST_SUITE_END()

another_global_fixture.cpp:
    BOOST_GLOBAL_FIXTURE(global3)
    BOOST_AUTO_TEST_CASE(three)
    BOOST_GLOBAL_FIXTURE(global4)
    BOOST_AUTO_TEST_CASE(four)

I get this output (each fixture does a printout in c'tor/d'tor):

    global setup
    global2 setup
    global4 setup
    global3 setup
    Running 4 test cases...
    suite setup
    one
    suite teardown
    suite setup
    two
    suite teardown
    three
    four
    global teardown
    global2 teardown
    global4 teardown
    global3 teardown

    *** No errors detected

Two things were unexpected. First, the global fixtures didn't execute in the order they were encountered in the translation unit(s): global3 was in the source file before global4, but they were initialized in the reverse order, while the fixtures in the other source file were initialized in the order encountered. Second, when multiple global fixtures are present, they aren't torn down in the reverse order in which they are set up, but in the same order in which they are set up. So we don't get symmetric setup/teardown like we do with a test case fixture. In other words, we don't get:

    global setup
    global2 setup
    global4 setup
    global3 setup
    ...tests
    global3 teardown
    global4 teardown
    global2 teardown
    global teardown

Since the global fixture registration is done via static initializers, the order in which they are initialized relative to each other is undefined across translation units. C++ makes no guarantees about the order of static initializers between translation units.

Furthermore, the global fixture registration creates the global fixtures as observers of the test tree with equal priority, and all observers are stored in a std::set with a comparison function that uses the priority. However, since they all have priority zero, there is no specific ordering to them. This is why global4 is set up before global3, even though global3 appears in the source file before global4.

Add this all up and I can't really recommend global fixtures. They only work as you might expect from using test case and test suite fixtures when there is exactly one of them.
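The ordering effect described above can be reproduced outside Boost.Test. What follows is a minimal sketch, not Boost code: Observer, PriorityOrder and visit_order are hypothetical names, and breaking ties by object address is an assumption about how equal-priority observers might end up ordered inside such a std::set. It shows that a set keyed on priority forgets the registration order as soon as all priorities are equal.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a test observer; this is not a Boost.Test type.
struct Observer {
    std::string name;
    int priority;
};

// Order by priority first; for equal priorities fall back to the object
// address (assumed tie-break, so that distinct observers stay distinct).
struct PriorityOrder {
    bool operator()(const Observer* lhs, const Observer* rhs) const
    {
        if (lhs->priority != rhs->priority)
            return lhs->priority < rhs->priority;
        return std::less<const Observer*>()(lhs, rhs);
    }
};

// Insert the observers in registration order and report the order in which
// the set visits them. The same forward walk would be used for setup and
// for teardown, so teardown is not the mirror image of setup.
std::vector<std::string> visit_order(const std::vector<const Observer*>& registration_order)
{
    std::set<const Observer*, PriorityOrder> store;
    for (const Observer* observer : registration_order) {
        assert(observer != nullptr);
        store.insert(observer);
    }

    std::vector<std::string> names;
    for (const Observer* observer : store)
        names.push_back(observer->name);
    return names;
}
```

If the observers live in one array and are registered back to front, the walk still follows the addresses rather than the registration order. Walking the same container in reverse for teardown would at least make setup and teardown symmetric.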
participants (5)
- Gennadiy Rozental
- Kim Barrett
- legalize+jeeves@mail.xmission.com
- Martin Vymazal
- Rhys Ulerich