Subject: Re: [Boost-users] [EXTERNAL] bjam hangs on select (in develop branch)
From: Alain Miniussi (alain.miniussi_at_[hidden])
Date: 2014-10-20 18:02:21


Hi Noel,

No, no -j option.

I tried the -p option (since bjam is hanging in a select on the output
streams), with no effect.
I don't know if it's relevant, but it seems that most calls to setpgid
(and those on the sh process) set errno to 13 (EACCES, a permission problem).
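
For reference, setpgid(2) documents EACCES for the case where the target is
a child of the caller that has already called exec. Whether bjam hits exactly
this path is an assumption (not checked against its source). The Python
sketch below is only an illustration of that rule, not bjam's code; the
function name is made up for the example.

```python
import errno
import os
import subprocess


def setpgid_after_exec():
    """Call setpgid() on a child that has already exec'd; return the errno seen (0 if none)."""
    # With CPython's subprocess module, Popen should only return once the
    # child has exec'd /bin/sh (or the exec has failed).
    child = subprocess.Popen(["/bin/sh", "-c", "sleep 1"])
    try:
        # Try to make the child the leader of its own process group,
        # from the parent side, after the exec has happened.
        os.setpgid(child.pid, child.pid)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        # Clean up the child regardless of the outcome.
        child.kill()
        child.wait()


if __name__ == "__main__":
    code = setpgid_after_exec()
    print(code, errno.errorcode.get(code))
```

On Linux this is expected to report EACCES. If that is what happens here,
the failing setpgid calls may be a side effect of call ordering rather than
the cause of the hang.
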
The select is waiting (without -p) on the stdout of the 'sh' process
(with the redirected stderr).
If I replace mpiexec.hydra (a binary) with mpirun (a wrapper around that
binary), only mpiexec.hydra ends up defunct:

  PID USER   PR NI S %CPU   TIME+ PPID COMMAND
  769 alainm 20  0 S  0.0 0:02.79  768 bjam
 1028 alainm 20  0 T  0.0 0:00.00  769 sh
 1029 alainm 20  0 T  0.0 0:00.00 1028 mpirun
 1034 alainm 20  0 Z  0.0 0:00.00 1029 mpiexec.hydra <defunct>
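
The picture above (sh in state T, its child a zombie, bjam asleep in select)
can be reproduced in isolation. The Python sketch below is an illustration
under assumptions, not bjam's exec_wait: SIGSTOP stands in for whatever
stopped the shell here (which is not known), /proc is assumed to be
available (Linux), and the function names are made up for the example.

```python
import os
import select
import signal
import subprocess
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # Format is "pid (comm) state ..."; split on the last ')' since
        # comm itself may contain spaces or parentheses.
        return f.read().rsplit(")", 1)[1].split()[0]


def stopped_child_blocks_select():
    """Show that select() on a child's stdout pipe reports nothing while the child is stopped."""
    # The shell forks 'sleep 1', then writes to the pipe and exits.
    child = subprocess.Popen(["/bin/sh", "-c", "sleep 1; echo done"],
                             stdout=subprocess.PIPE)
    fd = child.stdout.fileno()

    time.sleep(0.3)                      # give the shell time to start 'sleep'
    os.kill(child.pid, signal.SIGSTOP)   # the shell is now stopped (state T)

    # 'sleep' finishes while the shell is stopped, so nobody reaps it and the
    # shell still holds the write end of the pipe: no data, no EOF.
    ready, _, _ = select.select([fd], [], [], 1.5)
    readable_while_stopped = bool(ready)
    state = proc_state(child.pid)

    os.kill(child.pid, signal.SIGCONT)   # let the shell continue
    ready, _, _ = select.select([fd], [], [], 5.0)
    output = os.read(fd, 100) if ready else b""

    child.stdout.close()
    child.wait()
    return readable_while_stopped, state, output


if __name__ == "__main__":
    print(stopped_child_blocks_select())
```

In this sketch the pipe only becomes readable once the stopped shell is
continued, which suggests the thing to look for is why sh ends up in state T
rather than anything in mpiexec.hydra itself.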

Alain

On 20/10/2014 19:10, Belcourt, Kenneth wrote:
> Hi Alain,
>
> I’ve seen this problem before but it appears to affect very few people so I’ve not needed to fix it. Perhaps the time has come to address it.
>
> Was bjam passed a -j option? If so, what was it?
>
> — Noel
>
> On Oct 20, 2014, at 9:33 AM, Alain Miniussi <alain.miniussi_at_oca.eu> wrote:
>
>> Hi,
>>
>> I am trying to test Boost.MPI with Intel's implementation and I am stuck while trying to run simple tests through bjam.
>> Bjam hangs on the select (not pselect?) call in the Unix exec_wait.
>> As far as processes are concerned:
>>
>> PID USER PR NI S %CPU TIME+ PPID COMMAND
>> .......................
>> 16882 alainm 20 0 S 0.0 0:01.61 6507 bjam
>> 16899 alainm 20 0 T 0.0 0:00.00 16882 sh
>> 16900 alainm 20 0 Z 0.0 0:00.00 16899 mpiexec.hydra <defunct>
>> .......
>>
>> bjam runs a generated shell script (below) that calls mpiexec.hydra, which works perfectly fine outside bjam.
>> The mpiexec.hydra process dies, but the shell refuses to let it go.
>>
>> The shell script, generated by bjam, is:
>>
>> ===============================================
>> [alainm_at_gurney engine]$ more /proc/16899/cmdline
>> /bin/sh
>> LD_LIBRARY_PATH="/gpfs/scratch/alainm/view/boost/bin.v2/libs/mpi/build/intel-linux/debug:/gpfs/scratch/alainm/view/boost/bin.v2/libs/serialization/build/intel-linux/debug:/softs/intel/composer_xe_2015.0.090/bin/lib:/softs/intel/composer_xe_2015.0.090/lib/intel64:$LD_LIBRARY_PATH"
>> export LD_LIBRARY_PATH
>>
>> status=0
>> if test $status -ne 0 ; then
>> echo Skipping test execution due to testing.execute=off
>> exit 0
>> fi
>> mpiexec.hydra -n 2 "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2" blob > "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output" 2>&1
>> status=$?
>> echo >> "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
>> echo EXIT STATUS: $status >> "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
>> if test $status -eq 0 ; then
>> cp "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output" "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run"
>> fi
>> verbose=0
>> if test $status -ne 0 ; then
>> verbose=1
>> fi
>> if test $verbose -eq 1 ; then
>> echo ====== BEGIN OUTPUT ======
>> cat "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
>> echo ====== END OUTPUT ======
>> fi
>> exit $status
>>
>> [alainm_at_gurney engine]$
>> =================================================
>>
>>
>> Note that select only tests for the subprocess output; at the point of the hang, mpiexec.hydra is done with its output.
>>
>> Any ideas?
>>
>> Alain
>>
>> PS: There was a CMake-based project some time ago; is it still active, or is bjam here to stay?
>>
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users

-- 
---
Alain
