mkbane
2009-02-15 22:08:28 UTC
I believe the attached code should work correctly but I find that
using
the mpich2 implementation on one particular box it hangs for odd
numbers
of processors (not every time but frequently), whereas this code runs
fine on another box I've tried (albeit with OpenMPI)
Details are below and I'd welcome suggestions as to the cause of the
problem. Note that if I add WRITE statements or use the debugger the
problem appears to go away. Adding a FLUSH and BARRIER immediately
after
the WRITE stmt makes no difference.
To confuse myself further, if I replace the MPI_Send() by MPI_SSend(),
ie synchronous, sometimes the code completes whereas other times it
appears to hang (see sync.out example at end)
Thanks, Michael
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ cat
mkb_ring_solution_send_and_recv_portable.f90
PROGRAM ring
! this program will work on all MPI implementations
USE MPI
IMPLICIT NONE
! since we're only sending a single message between any src/dest pair
we
can use a single tag
INTEGER, PARAMETER :: myTag=101
INTEGER :: ierror, inputRank, myRank, size
INTEGER :: sendTo, recvFrom
INTEGER :: recv_status(MPI_STATUS_SIZE)
! initialise MPI
CALL MPI_INIT(ierror)
! determine my rank and total size
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myRank, ierror)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
! set up which process rank is to my right (ie clockwise) for sending
sendTo = myRank + 1
IF (sendTo == size) sendTo = 0
recvFrom = myRank - 1
IF (recvFrom == -1) recvFrom = size-1
! send my rank clockwise (from recvFrom to sendTo)
! to ensure nobody everybody is sending (and possibly waiting) at the
same time, we split into even (send then recv)
! and odd (recv then send)
if (mod(myRank,2)==0) then
call mpi_send(myRank,1,MPI_INTEGER,sendTo, myTag, &
MPI_COMM_WORLD,ierror)
call mpi_recv(inputRank,1,MPI_INTEGER,recvFrom,myTag, &
MPI_COMM_WORLD,recv_status,ierror)
else
call mpi_recv(inputRank,1,MPI_INTEGER,recvFrom,myTag, &
MPI_COMM_WORLD,recv_status,ierror)
call mpi_send(myRank,1,MPI_INTEGER,sendTo, myTag, &
MPI_COMM_WORLD,ierror)
endif
write(*,*) 'i am #',myRank,' and I received a new
rank=',inputRank
CALL MPI_FINALIZE(ierror)
END PROGRAM ring
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ mpif90 -O0
mkb_ring_solution_send_and_recv_portable.f90;mpif90 -show
ifort: Command line warning: overriding '-O3' with '-O0'
ifort -O3 -I/opt/mpi/mpibull2-0.9.7-2.t_RC4v4.3/include
-L/home/horace/mccssmb2/.mpibull2/lib
-L/opt/mpi/mpibull2-0.9.7-2.t_RC4v4.3/lib -lmpidev -lmpi -lrt -ldl
-lelan -lelanctrl -lcpuset
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ date;prun -O -n 5 -p
login ./a.out
Sun Feb 15 19:06:56 GMT 2009
Using qxelan driver, build for MPIBull2 0.9.7-t (Ishtar) 20060726-1607
i am # 1 and I received a new rank= 0
i am # 2 and I received a new rank= 1
i am # 3 and I received a new rank= 2
i am # 4 and I received a new rank= 3
(CNTL-C)
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ date
Sun Feb 15 19:23:47 GMT 2009
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ prun -I -n 5 -p
login ./sync.out
Using qxelan driver, build for MPIBull2 0.9.7-t (Ishtar) 20060726-1607
i am # 1 and I received a new rank= 0
i am # 2 and I received a new rank= 1
[prun -O means over-commit if required, -I means no over-commit but
fail
without running if insufficient resources]
using
the mpich2 implementation on one particular box it hangs for odd
numbers
of processors (not every time but frequently), whereas this code runs
fine on another box I've tried (albeit with OpenMPI)
Details are below and I'd welcome suggestions as to the cause of the
problem. Note that if I add WRITE statements or use the debugger the
problem appears to go away. Adding a FLUSH and BARRIER immediately
after
the WRITE stmt makes no difference.
To confuse myself further, if I replace the MPI_Send() by MPI_SSend(),
ie synchronous, sometimes the code completes whereas other times it
appears to hang (see sync.out example at end)
Thanks, Michael
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ cat
mkb_ring_solution_send_and_recv_portable.f90
PROGRAM ring
! this program will work on all MPI implementations
USE MPI
IMPLICIT NONE
! since we're only sending a single message between any src/dest pair
we
can use a single tag
INTEGER, PARAMETER :: myTag=101
INTEGER :: ierror, inputRank, myRank, size
INTEGER :: sendTo, recvFrom
INTEGER :: recv_status(MPI_STATUS_SIZE)
! initialise MPI
CALL MPI_INIT(ierror)
! determine my rank and total size
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myRank, ierror)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
! set up which process rank is to my right (ie clockwise) for sending
sendTo = myRank + 1
IF (sendTo == size) sendTo = 0
recvFrom = myRank - 1
IF (recvFrom == -1) recvFrom = size-1
! send my rank clockwise (from recvFrom to sendTo)
! to ensure nobody everybody is sending (and possibly waiting) at the
same time, we split into even (send then recv)
! and odd (recv then send)
if (mod(myRank,2)==0) then
call mpi_send(myRank,1,MPI_INTEGER,sendTo, myTag, &
MPI_COMM_WORLD,ierror)
call mpi_recv(inputRank,1,MPI_INTEGER,recvFrom,myTag, &
MPI_COMM_WORLD,recv_status,ierror)
else
call mpi_recv(inputRank,1,MPI_INTEGER,recvFrom,myTag, &
MPI_COMM_WORLD,recv_status,ierror)
call mpi_send(myRank,1,MPI_INTEGER,sendTo, myTag, &
MPI_COMM_WORLD,ierror)
endif
write(*,*) 'i am #',myRank,' and I received a new
rank=',inputRank
CALL MPI_FINALIZE(ierror)
END PROGRAM ring
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ mpif90 -O0
mkb_ring_solution_send_and_recv_portable.f90;mpif90 -show
ifort: Command line warning: overriding '-O3' with '-O0'
ifort -O3 -I/opt/mpi/mpibull2-0.9.7-2.t_RC4v4.3/include
-L/home/horace/mccssmb2/.mpibull2/lib
-L/opt/mpi/mpibull2-0.9.7-2.t_RC4v4.3/lib -lmpidev -lmpi -lrt -ldl
-lelan -lelanctrl -lcpuset
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ date;prun -O -n 5 -p
login ./a.out
Sun Feb 15 19:06:56 GMT 2009
Using qxelan driver, build for MPIBull2 0.9.7-t (Ishtar) 20060726-1607
i am # 1 and I received a new rank= 0
i am # 2 and I received a new rank= 1
i am # 3 and I received a new rank= 2
i am # 4 and I received a new rank= 3
(CNTL-C)
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ date
Sun Feb 15 19:23:47 GMT 2009
~/RCS/myCourses/Intro_to_MPI/MPI_Intro_exercises$ prun -I -n 5 -p
login ./sync.out
Using qxelan driver, build for MPIBull2 0.9.7-t (Ishtar) 20060726-1607
i am # 1 and I received a new rank= 0
i am # 2 and I received a new rank= 1
[prun -O means over-commit if required, -I means no over-commit but
fail
without running if insufficient resources]