Discussion:
Abrupt closure of process
nithesh
2009-05-14 11:10:34 UTC
Hi,

I am a novice in parallel programming. While testing the Message Passing
Interface (MPI), I found that when I run multiple processes with mpirun
and one of them is closed abruptly for some reason, the following
message is displayed on the screen and all the processes are
terminated.

The message displayed is:
"mpirun has exited due to process rank 3 with PID 15887 on node
linuxtest2 exiting without calling "finalize". This may have caused
other processes in the application to be terminated by signals sent
by mpirun (as reported here)."

Is there any way to avoid this? I mean, if one process is terminated,
mpirun should not exit and the remaining processes should continue
running.

Awaiting your reply.

Nithesh
Jomar Bueyes
2009-05-14 14:46:05 UTC
Hi Nithesh,

The solution depends on why the process stops abruptly. If it stops
because of a fatal error, there is not much you can do unless you
have a mechanism to trap exceptions. If you can trap them, you could
insert code at several places that makes all processes check whether
any of them has hit an exception. If any process has, all processes
call MPI_FINALIZE and stop.
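
A minimal sketch of such a check, assuming Python with the mpi4py bindings (the thread does not say which language is in use). The names count_failures and risky_step are made up for illustration, and rank 3 is made to fail only to mirror the message above. Each rank traps its own exception, then all ranks sum their failure flags at a common checkpoint and stop together if the sum is non-zero.

```python
import sys


def count_failures(comm, local_ok):
    """Return how many ranks reported a failure (same value on every rank)."""
    # mpi4py's allreduce sums by default, so each rank contributes 0 or 1.
    return comm.allreduce(0 if local_ok else 1)


def risky_step(rank):
    """Hypothetical unit of work; rank 3 fails here purely for illustration."""
    if rank == 3:
        raise RuntimeError("simulated failure on rank 3")
    return rank * rank


def main(comm):
    rank = comm.Get_rank()

    local_ok = True
    try:
        risky_step(rank)
    except Exception as exc:
        # Trap the error instead of letting this rank die on its own.
        local_ok = False
        print(f"rank {rank}: trapped {exc!r}", file=sys.stderr)

    # Collective checkpoint: every rank must reach this call.
    failures = count_failures(comm, local_ok)
    if failures > 0:
        if rank == 0:
            print(f"{failures} rank(s) failed; all ranks stopping",
                  file=sys.stderr)
        return  # all ranks return together; mpi4py finalizes MPI at exit
    if rank == 0:
        print("all ranks completed the step")


if __name__ == "__main__":
    try:
        from mpi4py import MPI  # assumption: mpi4py is installed
    except ImportError:
        print("mpi4py not installed; nothing to run")
    else:
        main(MPI.COMM_WORLD)
```

Launched on four ranks with something like "mpirun -n 4 python check_all.py" (the file name is invented), every rank should see a failure count of 1 and return normally. This only covers errors the program can trap; a process that is killed outright never reaches the checkpoint.
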
On the other hand, if the process stops abruptly because of a stop
(Fortran) or exit (C, C++) statement, you should revise the code so
that stop or exit statements appear only in sections of code that
all processes execute.
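
As a rough illustration of keeping the exit in code that every process executes, here is a sketch that again assumes Python with mpi4py. The validation rule and the names config_is_valid and agree_on_stop are hypothetical. Rank 0 alone decides whether to stop, but it broadcasts that decision so that every rank leaves through the same branch.

```python
import sys


def config_is_valid(config):
    """Hypothetical validation rule: the step count must be positive."""
    return config.get("steps", 0) > 0


def agree_on_stop(comm, rank, config):
    """Rank 0 decides; bcast hands the same decision to every rank."""
    stop = (not config_is_valid(config)) if rank == 0 else None
    return comm.bcast(stop, root=0)


def main(comm):
    rank = comm.Get_rank()
    config = {"steps": 0}  # deliberately invalid, for illustration

    # Avoid: `if rank == 0 and not config_is_valid(config): sys.exit(1)`,
    # which stops one rank while the others are still running.
    if agree_on_stop(comm, rank, config):
        if rank == 0:
            print("invalid configuration; all ranks stopping",
                  file=sys.stderr)
        return  # every rank takes this branch together
    print(f"rank {rank}: running {config['steps']} steps")


if __name__ == "__main__":
    try:
        from mpi4py import MPI  # assumption: mpi4py is installed
    except ImportError:
        print("mpi4py not installed; nothing to run")
    else:
        main(MPI.COMM_WORLD)
```

The same idea carries over to Fortran or C: make the decision collectively first, then place the stop or exit where all ranks reach it.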

HTH,

Jomar
nithesh
2009-05-15 04:16:31 UTC
Hi Jomar,

Thanks for your advice.

Nithesh