Discussion:
mpirun error for large grid
r***@gmail.com
2009-03-13 21:35:56 UTC
I am learning MPI, and I encountered an error which I need some help
solving.
Here is my error:
mpispawn.c:303 Unexpected exit status

Child exited abnormally!
Killing remote processes...DONE

And here is an explanation of what I am doing:
My code runs, and I receive no errors in the
"Intelexcite.oPROCESSNUMBER" file, for values of (ie,je,ke) up to
600. (ie,je,ke) set the size of a 3-D grid indexed by i,j,k. Once I
increase (ie,je,ke) to 650, I receive the error.

At this point, the program does not write out any values. For values
of (ie,je,ke) of 600 and under, I do not run into any issues. Do you
have any suggestions for how I could avoid or solve this issue?

This is my submit.sh file:
#PBS -N cpmlSNARF
#PBS -l nodes=1:ppn=8
#PBS -l walltime=01:00:00
cd $PBS_O_WORKDIR
/users/system/intel/mvapich-gen2/bin/mpirun -machinefile $PBS_NODEFILE \
    -np 8 a.out
none
2009-03-16 07:29:31 UTC
Just an idea: if your 3D grid contains doubles (8 bytes), then
600*600*600*8 ~ 1.7GB, whereas 650*650*650*8 ~ 2.2GB.
Now, as you try to run 8 processes on a single compute node, I guess that
this node has 16GB and that the 650 case runs out of memory. If each
process holds the full grid, that is about 13.8GB in total at 600 but
about 17.6GB at 650.
Gilles
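The estimate above can be checked with a few lines of Python. This is only a sketch under the same guesses made in the reply: the grid holds 8-byte doubles, each of the 8 processes allocates one full copy of a single array, and the node has 16GB. None of these is confirmed by the original post, and the names grid_bytes, NODE_RAM_GB and RANKS are made up for illustration.

```python
def grid_bytes(n, bytes_per_value=8, arrays=1):
    """Bytes needed for `arrays` cubic n*n*n grids of fixed-size values."""
    return n ** 3 * bytes_per_value * arrays

NODE_RAM_GB = 16  # assumption: the node's memory, as guessed in the reply
RANKS = 8         # from "-np 8" in the submit script

for n in (600, 650):
    per_copy_gb = grid_bytes(n) / 1e9   # decimal gigabytes
    total_gb = per_copy_gb * RANKS      # assumes every rank holds a full copy
    fits = total_gb <= NODE_RAM_GB
    print(f"n={n}: {per_copy_gb:.2f} GB per copy, "
          f"{total_gb:.2f} GB for {RANKS} copies, fits={fits}")
```

A real solver usually keeps several field arrays per grid point, so the true footprint is likely larger; passing a larger arrays value to grid_bytes gives a rough upper estimate.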
r***@gmail.com
2009-03-31 02:12:23 UTC
Gilles, you are correct about running out of memory. I posted what I
believe the issue was at this site: http://www.ece.unm.edu/~etanner/MPI.html