mlm
2008-03-09 01:25:53 UTC
I'm using torque and mpich2 on a 64 node cluster
I am submitting a job through qsub that calls mpiexec. My
intent is to have the same program run on 2 different nodes. These
two programs will then talk to each other using MPI. Here are some of
my settings:
#PBS -l nodes=2:ppn=1
...
mpiexec -l -n 2 myprog
In my script, I print out the PBS_NODEFILE (using cat $PBS_NODEFILE),
and I'm seeing that two nodes are listed. However, both jobs are
placed on one node (the first node listed in PBS_NODEFILE). The
programs run without any problems, but I just want them to run on
separate nodes due to their memory usage.
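One way to confirm where each rank actually lands is sketched below. It is a minimal check, not a fix: the helper name count_distinct_nodes is hypothetical, and hostname is used as a stand-in for myprog so that each rank reports the node it is running on. It assumes mpiexec is on the PATH inside the job.

```shell
# Hypothetical helper: count the distinct hosts listed in a Torque node file.
count_distinct_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# PBS_NODEFILE is set by Torque inside a job; the guard keeps this
# snippet harmless when it is run outside of one.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "distinct nodes allocated: $(count_distinct_nodes "$PBS_NODEFILE")"
    # hostname stands in for myprog; -l prefixes each output line with its rank
    mpiexec -l -n 2 hostname
fi
```

If the count is 2 but both labelled lines show the same hostname, the allocation is fine and the placement is being decided when mpiexec launches the processes.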
Can anyone help me out? Is this an mpich or pbs problem?
Thanks