Simple MPI program not working
f***@gmail.com
2010-04-21 05:58:08 UTC
Dear MPI programmers,

I'm a beginner at MPI programming (in C). I have studied some
of the basic MPI functions: MPI_Init, MPI_Comm_size, MPI_Comm_rank,
MPI_Send ...
As a first exercise, I wrote the following program:

#include<stdio.h>
#include<mpi.h>
int main(int argc, char *argv)
{
int my_rank,tot_rank;

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &tot_rank);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

if (my_rank==0)
{
int a=10,*recsum;
MPI_Status s0;
MPI_Send(&a,1,MPI_INT,1,100,MPI_COMM_WORLD);
MPI_Recv(*recsum,1,MPI_INT,0,200,MPI_COMM_WORLD,&s0);
printf("\n Received sum from process 1: %d", *recsum);
}
else
{
int *getsum,totsum;
MPI_Status s1;
MPI_Recv(*getsum,1,MPI_INT,0,100,MPI_COMM_WORLD, &s1);
totsum=*getsum+10;
MPI_Send(&totsum,1,MPI_INT,0,200,MPI_COMM_WORLD);
}
MPI_Finalize();
return 0;
}

The program is very simple. It's made to run with 2 MPI processes (-np
2). Process 0 sends an integer to process 1. Process 1 adds 10 to the
received number and sends the resulting sum back to process 0.
Compilation goes smoothly, but during execution it seg-faults:

$ mpirun -np 2 ./sammpi1
[localhost:03026] *** Process received signal ***
[localhost:03023] *** Process received signal ***
[localhost:03023] Signal: Segmentation fault (11)
[localhost:03023] Signal code: Address not mapped (1)
[localhost:03023] Failing at address: (nil)
[localhost:03023] [ 0] /lib64/libpthread.so.0 [0x396f20eee0]
[localhost:03023] [ 1] ./sammpi1(main+0xbe) [0x401e2e]
[localhost:03023] [ 2] /lib64/libc.so.6(__libc_start_main+0xfd) [0x396e61ea2d]
[localhost:03023] [ 3] ./sammpi1 [0x401ca9]
[localhost:03023] *** End of error message ***
[localhost:03026] Signal: Segmentation fault (11)
[localhost:03026] Signal code: Address not mapped (1)
[localhost:03026] Failing at address: (nil)
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 3023 on node
localhost.localdomain exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[localhost:03026] [ 0] /lib64/libpthread.so.0 [0x396f20eee0]
[localhost:03026] [ 1] ./sammpi1(main+0x104) [0x401e74]
[localhost:03026] [ 2] /lib64/libc.so.6(__libc_start_main+0xfd) [0x396e61ea2d]
[localhost:03026] [ 3] ./sammpi1 [0x401ca9]
[localhost:03026] *** End of error message ***

My understanding is that the error occurs because the send and receive
operations do not wait for each other when buffering the message, and
that this causes the program to seg-fault.

Can someone shed light on how this issue can be addressed?

Thanks in advance
Michael Hofmann
2010-04-21 06:53:21 UTC
Post by f***@gmail.com
Can someone shed light on how this issue can be addressed?
Try switching on some warning messages in your compiler (e.g. "-Wall" with
gcc). You should get a lot of warnings in this small piece of code.
Post by f***@gmail.com
#include<stdio.h>
#include<mpi.h>
int main(int argc, char *argv)
                        ^^^^^
This should be "**argv".
Post by f***@gmail.com
{
int my_rank,tot_rank;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &tot_rank);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
if (my_rank==0)
{
int a=10,*recsum;
         ^^^^^^^
_You_ need to allocate the receive buffer, but this is just an
uninitialized pointer. Change the declaration "*recsum" to "recsum", so
that it is a plain int.
Post by f***@gmail.com
MPI_Status s0;
MPI_Send(&a,1,MPI_INT,1,100,MPI_COMM_WORLD);
MPI_Recv(*recsum,1,MPI_INT,0,200,MPI_COMM_WORLD,&s0);
         ^^^^^^^           ^
Here is your segfault, because you try to dereference an uninitialized
pointer. You need to pass a pointer to the receive buffer (change
"*recsum" to "&recsum").

The source for the receive should be process 1, not 0!
Post by f***@gmail.com
printf("\n Received sum from process 1: %d", *recsum);
                                             ^^^^^^^
Now, this should be "recsum" (without *).
Post by f***@gmail.com
}
else
{
int *getsum,totsum;
    ^^^^^^^
Post by f***@gmail.com
MPI_Status s1;
MPI_Recv(*getsum,1,MPI_INT,0,100,MPI_COMM_WORLD, &s1);
         ^^^^^^^
Post by f***@gmail.com
totsum=*getsum+10;
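       ^^^^^^^
The same problem as above: declare "getsum" as a plain int, pass "&getsum"
to MPI_Recv, and use "getsum" (without *) in the sum.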


Michael
f***@gmail.com
2010-04-21 12:27:28 UTC
My understanding was wrong. Thanks for correcting it.

I'm using the Intel C compiler. There were warnings during compilation,
but I ignored them.

The MPI_Send and MPI_Recv functions
(http://www.mpi-forum.org/docs/mpi-11-html/node31.html#Node31):

int MPI_Send(void* buf, int count, MPI_Datatype datatype, int dest,
int tag, MPI_Comm comm)
int MPI_Recv(void* buf, int count, MPI_Datatype datatype, int source,
int tag, MPI_Comm comm, MPI_Status *status)

The 1st argument of MPI_Send is the address of the data being sent, so
I thought the 1st argument of MPI_Recv should be a pointer that receives it.

Also, what is the purpose of the parameters int tag and MPI_Status *status
in the respective operations?

Thank you
Michael Hofmann
2010-04-22 08:30:50 UTC
Post by f***@gmail.com
My understanding was wrong. Thanks for correcting it.
I'm using Intel C compiler. There were warnings during compilation,
but I ignored.
The MPI_Send & Receive functions
(http://www.mpi-forum.org/docs/mpi-11-html/node31.html#Node31)
int MPI_Send(void* buf, int count, MPI_Datatype datatype, int dest,
int tag, MPI_Comm comm)
int MPI_Recv(void* buf, int count, MPI_Datatype datatype, int source,
int tag, MPI_Comm comm, MPI_Status *status)
The 1st argument of MPI_Send contains address of sending argument, So
I thought 1st arg of MPI_Recv should be a pointer to receive.
No, an "address" belongs to the address space of a single process.
Therefore, it doesn't make sense to send address values to another process.
In both cases, "buf" is the address of the memory location where the data
to be sent is stored, or where the received data should be stored.
Post by f***@gmail.com
And, what is the purpose of parameters int tag and MPI_Status in the
respective operations?
"tag" can be used to distinguish messages that come from the same process.
Messages from the same process and with the same tag can be received only
in the same order as they were sent. If different tags are used, these
messages can be received "out-of-order" by specifying the tag of the
requested messages in the receive operation.

The "status" can be used to determine some properties of the received
message, e.g.:
- the "real" number of elements received ("count" in MPI_Recv specifies
only the maximum number of elements that can be received!)
- the source process of the message, useful if MPI_ANY_SOURCE was used for
the receive operation
- the tag of the message, useful if MPI_ANY_TAG was used for the receive
operation


Michael
