Subject: Re: [Boost-users] [EXTERNAL] bjam hangs on select (in develop branch)
From: Alain Miniussi (alain.miniussi_at_[hidden])
Date: 2014-10-27 12:13:40


On 27/10/2014 16:32, Alain Miniussi wrote:
> On 25/10/2014 02:14, Belcourt, Kenneth wrote:
>> On Oct 24, 2014, at 4:52 PM, Belcourt, Kenneth <kbelco_at_[hidden]>
>> wrote:
>>
>>> On Oct 24, 2014, at 4:43 PM, Belcourt, Kenneth <kbelco_at_[hidden]>
>>> wrote:
>>>
>>>> On Oct 24, 2014, at 4:33 PM, Belcourt, Kenneth <kbelco_at_[hidden]>
>>>> wrote:
>>>>
>>>>> On Oct 24, 2014, at 7:56 AM, Alain Miniussi
>>>>> <alain.miniussi_at_oca.eu> wrote:
>>>>>
>>>>>> On 24/10/2014 15:33, Alain Miniussi wrote:
>>>>>>> I did a gnu/openmpi 1.8.2 build on ubuntu which exhibits the same
>>>>>>> problem.
>>>>>> It did not, just forgot to edit a field in project-config.jam.
>>>>>> Only the intel mpiexec/run hangs.
>>>>>>> Can the fact that the setpgid system call fails be an issue?
>>>>> Perhaps. We make the forked child process its own process group
>>>>> leader so that if it’s an MPI job and it dies, all the MPI ranks
>>>>> are cleaned up as well. We’ve been using this syntax for a number
>>>>> of years on multiple platforms without issues so I’m a little
>>>>> surprised it fails on Ubuntu with OpenMPI 1.8.2. That said, it’s
>>>>> possible that there’s a race condition that you’re able to tickle.
>>>>>
>>>>> For example, we fork the child process and right before we exec
>>>>> the child process, we set the child process group. We also set the
>>>>> child process group in the parent process. Perhaps we
>>>>> should only do this once, not twice (i.e. only in the child or only
>>>>> in the parent, not both). Or perhaps there’s a race if both the
>>>>> child and parent calls to setpgid run concurrently.
>>>> Just pushed this commit, 7bcbc5ac31ab1, to develop which adds
>>>> checks to the setpgid calls and, if they fail, indicates whether it
>>>> was the parent or child process who called. Can you give this a
>>>> try and let me know which call is failing?
>>> Well I be danged. I was just testing this change on my Mac and
>>> found this in the output:
>>>
>>> setpgid (parent): Permission denied
>>>
>>> So it seems we’ve been ignoring this problem for some time and
>>> didn’t know it. That would be my bad. Let me work on a fix (will
>>> probably remove the duplicate call in the parent process).
>> I left both setpgid checks in, but removed the call to exit() so
>> we’ll see the failed call to setpgid without killing b2.
>>
>> commit 156bc5c42ec3 in develop.
>
> Thanks,
>
> So the mpiexec.hydra is still defunct *but* I have something new:
> Let's say I am in the following situation:
>
> PID                             PPID
> 20104 alainm 20 0 S 0.0 0:13.33 17184 bjam
> 20170 alainm 20 0 T 0.0 0:00.00 20104 sh
> 20171 alainm 20 0 Z 0.0 0:00.00 20170 mpiexec.hydra <defunct>
> [alainm_at_gurney ~]$ pstree 20104
> bjam───sh───mpiexec.hydra
> [alainm_at_gurney ~]$
>
> So, mpiexec is dead, the calling shell should take notice, but somehow
> doesn't. It just waits, but with no conviction:
> $ gdb /bin/sh 20170
> ................
> (gdb) bt
> #0 0x0000003bd92ac8ce in __libc_waitpid (pid=-1,
> stat_loc=0x7fff344888bc, options=0)
> at ../sysdeps/unix/sysv/linux/waitpid.c:32
> #1 0x000000000043ec82 in waitchld (wpid=<value optimized out>,
> block=1) at jobs.c:3064
> #2 0x000000000043ff1f in wait_for (pid=20171) at jobs.c:2422
> #3 0x00000000004309f9 in execute_command_internal (command=0x18beda0,
>
> The interesting thing is that, if I just enter a <continue> command
> under gdb, then bjam magically proceeds up to the next mpiexec.
>
> Which gave me the idea to just
> $ kill -CONT <shell id>
> to see the next target proceed.
>
> So my current theory is that mpiexec.hydra pauses its calling
> process by sending it a STOP signal (why would it do that? I have no
> clue) and then exits without sending a CONTINUE signal.
>
> Maybe signaling the child from exec_cmd just before the select would
> be a solution, but it looks like a pretty ugly one...

Ok, I might have a fix (nothing to be proud of though) that basically
consists of inserting:
    for ( i = 0; i < globs.jobs; ++i ) {
        if ( cmdtab[ i ].pid != 0 ) {
            kill( cmdtab[ i ].pid, SIGCONT );
        }
    }
at the beginning of exec_wait.
Maybe killpg would be better (didn't check), since I'm not sure that a
simple kill will deal with mpirun (which adds a shell layer between bjam
and mpiexec.hydra).
I'll try to propose a pull request tonight.
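
To make the kill vs. killpg question concrete, here is a minimal standalone sketch. It is not the bjam patch itself. The helper names resume_job and demo_resume_stopped_child are hypothetical, and the child here simply stops itself with SIGSTOP to imitate the stopped shell seen above. It assumes the child has been made its own process group leader, as described earlier in the thread, so that the process group id equals the child pid and killpg can reach every process in that group.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send SIGCONT to one job.
 * use_group != 0 signals the whole process group led by pid (killpg),
 * otherwise only the single process pid (kill).
 * An empty slot (pid <= 0) is skipped and reported as success. */
static int resume_job(pid_t pid, int use_group)
{
    if (pid <= 0)
        return 0;
    return use_group ? killpg(pid, SIGCONT) : kill(pid, SIGCONT);
}

/* Hypothetical demo: fork a child that becomes its own process group
 * leader, stops itself, and exits with status 7 once it is resumed.
 * Returns the child's exit status, or -1 on any failure. */
static int demo_resume_stopped_child(int use_group)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        setpgid(0, 0);      /* child side: become group leader */
        raise(SIGSTOP);     /* imitate the stopped shell */
        _exit(7);           /* only reached after SIGCONT */
    }

    /* Parent side: same request, so the group exists whichever side
     * runs first. The result is ignored in this sketch. */
    setpgid(pid, pid);

    /* Wait until the child is reported as stopped. */
    if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
        return -1;

    if (resume_job(pid, use_group) < 0)
        return -1;

    /* Now the child should run to completion. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

Calling demo_resume_stopped_child(0) exercises the plain kill path and demo_resume_stopped_child(1) the killpg path. With a single child both behave the same; the difference only matters when further processes (such as an intermediate shell) share the child's process group.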

Thanks !

Alain

>
> Alain
>
>
>>
>> — Noel
>>
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>
>

-- 
---
Alain
