f***@gmail.com
2010-04-21 05:58:08 UTC
Dear MPI programmers,
I'm a beginner at MPI programming (in C). I have studied some
of the basic MPI functions: MPI_Init, MPI_Comm_size, MPI_Comm_rank,
MPI_Send ...
As a first exercise, I wrote the following program:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv)
{
    int my_rank,tot_rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &tot_rank);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    if (my_rank==0)
    {
        int a=10,*recsum;
        MPI_Status s0;
        MPI_Send(&a,1,MPI_INT,1,100,MPI_COMM_WORLD);
        MPI_Recv(*recsum,1,MPI_INT,0,200,MPI_COMM_WORLD,&s0);
        printf("\n Received sum from process 1: %d", *recsum);
    }
    else
    {
        int *getsum,totsum;
        MPI_Status s1;
        MPI_Recv(*getsum,1,MPI_INT,0,100,MPI_COMM_WORLD, &s1);
        totsum=*getsum+10;
        MPI_Send(&totsum,1,MPI_INT,0,200,MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}
The program is very simple. It is meant to run with 2 MPI processes
(-np 2). Process 0 sends an integer to process 1. Process 1 adds 10 to
the received number and sends the resulting sum back to process 0.
Compilation goes smoothly, but during execution it segfaults:
$ mpirun -np 2 ./sammpi1
[localhost:03026] *** Process received signal ***
[localhost:03023] *** Process received signal ***
[localhost:03023] Signal: Segmentation fault (11)
[localhost:03023] Signal code: Address not mapped (1)
[localhost:03023] Failing at address: (nil)
[localhost:03023] [ 0] /lib64/libpthread.so.0 [0x396f20eee0]
[localhost:03023] [ 1] ./sammpi1(main+0xbe) [0x401e2e]
[localhost:03023] [ 2] /lib64/libc.so.6(__libc_start_main+0xfd) [0x396e61ea2d]
[localhost:03023] [ 3] ./sammpi1 [0x401ca9]
[localhost:03023] *** End of error message ***
[localhost:03026] Signal: Segmentation fault (11)
[localhost:03026] Signal code: Address not mapped (1)
[localhost:03026] Failing at address: (nil)
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 3023 on node
localhost.localdomain exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[localhost:03026] [ 0] /lib64/libpthread.so.0 [0x396f20eee0]
[localhost:03026] [ 1] ./sammpi1(main+0x104) [0x401e74]
[localhost:03026] [ 2] /lib64/libc.so.6(__libc_start_main+0xfd) [0x396e61ea2d]
[localhost:03026] [ 3] ./sammpi1 [0x401ca9]
[localhost:03026] *** End of error message ***
My understanding is that the error arises because the send and receive
operations do not wait for each other while the message is being
buffered, and that this causes the program to segfault.
Can someone shed light on how this issue can be addressed?
Thanks in advance