Irfan Khan
2007-11-17 01:52:55 UTC
Hi
I am facing a strange problem running my code on a 16-node Rocks
cluster.
1. Passwordless login works without any problem.
2. A sample hello world program and a simple send/receive code both run
without any problem.
3. When I try to run my own code on 2 nodes, I get the following error:
"Connection to compute-0-0 closed by remote host"
I have run the same code successfully several times on other clusters.
There is another interesting aspect to this problem. The code takes the
name of the input file as a command-line argument. When I provide it, I
get the error message shown in the output below.
Please notice that process 2 is trying to open a file named joshi_hpcc1.in,
which contains the hostname of the computer. The process should be trying
to open a file called a1.in. Can "argc" and "argv" be used to pass
arguments to a program started with mpirun?
I would appreciate any suggestions, comments, or advice on this issue.
Thanks a lot in advance.
[***@joshi_hpcc working_dir]$ mpirun -np 2 -machinefile machines ./lbm3dp a1
mpi version Parallel LBM, amended ONGOING
x and z divisions 2 1
xmin = -1, xmax = 100 lx = 102
ymin = 0, ymax = 199 lx = 200
zmin = 0, zmax = 199 lx = 200
mpi version Parallel LBM, amended ONGOING
Error opening file: joshi_hpcc1.in
p_lattbes=iteration1_3d
Navier-Stokes flow: p_feq_3d=feq_3d
Starting a new simulation
1, 0 warnings/errors
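For illustration only, here is a minimal sketch of how an input file name such as a1.in might be built from argv[1]. It is not the actual lbm3dp source, and the helper name build_input_name is hypothetical. The assumption behind it is that argv should be read only after MPI_Init has returned, because some MPI launchers may pass extra start-up arguments (such as a host name) to the remote processes, and MPI_Init is expected to strip them. The MPI calls themselves are left out so the sketch compiles without an MPI installation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from lbm3dp): builds "<argv[1]>.in" in out.
 * Intended to be called after MPI_Init, so that any launcher-specific
 * arguments have already been removed from argv.
 * Returns 0 on success, -1 if the argument is missing or out is too small.
 */
int build_input_name(int argc, char **argv, char *out, size_t outlen)
{
    int n;

    /* Reject a missing or empty base name instead of guessing one. */
    if (argc < 2 || argv[1] == NULL || argv[1][0] == '\0')
        return -1;

    /* snprintf reports the length it wanted, which detects truncation. */
    n = snprintf(out, outlen, "%s.in", argv[1]);
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    return 0;
}
```

If remote processes still see different arguments, one option is to call the helper on rank 0 only and send the resulting name to the other ranks with MPI_Bcast.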
Irfan