Discussion:
Sending std::vector object within MPI derived datatype
p***@gmail.com
2007-12-18 21:06:53 UTC
I have written the simple code below in an attempt to figure out a bug
in a larger code. This code attempts to complete a ghost cell exchange
between at least 3 processes (assumes non-periodic boundary
conditions). I believe that the problem lies in including a
std::vector object in an MPI derived datatype. I usually get two types
of errors when I run this code. The first is a crash with the SIGSEGV
signal and the other actually runs through the code but there ends up
being no transfer of data between the processes. When I run it with
DDT (Distributed Debugging Tool) it appears to hang on the message
transfer between node=0 and node=1.

OS -> RedHat Enterprise 4.0
MPI and compiler -> mpich-ch_p4-gcc-1.2.7 (standard with OSCAR which
is what our cluster is built with)

HEADER FILE
//std_vector_web_question.h
#ifndef STD_VECTOR_WEB_QUESTION_H
#define STD_VECTOR_WEB_QUESTION_H

#include <iostream>
#include <fstream>
#include <vector>
#include <string>

using namespace std;

class vect
{
private:

    std::vector<int> a;
    std::vector<int>::iterator it;
    ofstream output;

public:

    vect(){}
    ~vect(){}

    void add(const int n){
        int i;
        for(i=0;i<n;i++)
            a.push_back(0);
    }
    void fill(const int start){
        int j=start;
        for(it=a.begin();it<a.end();it++){
            *it=j;
            j++;
        }
    }
    void display(string mes,int node){
        output.open("data.txt", ios::out | ios::app);
        output << endl << mes << " a vector on process " << node << endl;
        for(it=a.begin();it<a.end();it++){
            output << *it << " ";
        }
        output.close();
    }
};

#endif


SOURCE FILE
//std_vector_web_question.cc
#include <iostream>
#include <vector>
#include <string>
#include "std_vector_web_question.h"
#include "mpi.h"

using namespace std;

int main(int argc,char **argv)
{
    int numnodes,mynode;
    const int N=5;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&numnodes);
    MPI_Comm_rank(MPI_COMM_WORLD,&mynode);

    vect wsend,esend,wrecv,erecv; //four vectors for passing data between processes

    int eastbor,westbor;
    //Get the neighbor process id's
    if(mynode==0){
        eastbor=9999; //placeholder
        westbor=mynode+1;
    }
    else if(mynode==numnodes-1){
        eastbor=mynode-1;
        westbor=9999; //placeholder
    }
    else{
        eastbor=mynode-1;
        westbor=mynode+1;
    }
    //set up the vectors in the object and fill in the data
    wsend.add(N);
    wrecv.add(N);
    esend.add(N);
    erecv.add(N);
    wsend.fill(N);
    wrecv.fill(0);
    esend.fill(N);
    erecv.fill(0);
    //now set up the MPI Datatype information
    MPI_Datatype onevec;
    MPI_Type_contiguous(N,MPI_INT,&onevec);
    MPI_Type_commit(&onevec);

    //now send data to the neighbors, looking at before and after send values
    if(mynode==0){
        wsend.display("wsend before",mynode);
        wrecv.display("wrecv before",mynode);
    }
    else if(mynode==(numnodes-1)){
        esend.display("esend before",mynode);
        erecv.display("erecv before",mynode);
    }
    else{
        wsend.display("wsend before",mynode);
        wrecv.display("wrecv before",mynode);
        esend.display("esend before",mynode);
        erecv.display("erecv before",mynode);
    }

    MPI_Status status;
    if(mynode!=0 && mynode!=(numnodes-1)){
        MPI_Sendrecv(&wsend,1,onevec,westbor,2,&wrecv,1,onevec,westbor,
                     1,MPI_COMM_WORLD,&status);
        MPI_Sendrecv(&esend,1,onevec,eastbor,3,&erecv,1,onevec,eastbor,
                     4,MPI_COMM_WORLD,&status);
    }
    else if(mynode==0){
        MPI_Sendrecv(&wsend,1,onevec,westbor,4,&wrecv,1,onevec,westbor,
                     3,MPI_COMM_WORLD,&status);
    }
    else if(mynode==(numnodes-1)){
        MPI_Sendrecv(&esend,1,onevec,eastbor,1,&erecv,1,onevec,eastbor,
                     2,MPI_COMM_WORLD,&status);
    }

    if(mynode==0){
        wsend.display("wsend after",mynode);
        wrecv.display("wrecv after",mynode);
    }
    else if(mynode==(numnodes-1)){
        esend.display("esend after",mynode);
        erecv.display("erecv after",mynode);
    }
    else{
        wsend.display("wsend after",mynode);
        wrecv.display("wrecv after",mynode);
        esend.display("esend after",mynode);
        erecv.display("erecv after",mynode);
    }
    //now clean up
    MPI_Type_free(&onevec);

    MPI_Finalize();

    cout << "\nAll done!\n";

    return 0;
}



MAKEFILE
FINALOBJECTS=std_vector_web_question.o

vec : $(FINALOBJECTS)
	mpiCC -g $(FINALOBJECTS) -lm -o vec

std_vector_web_question.o : std_vector_web_question.cc std_vector_web_question.h
	mpiCC -Wall -g -c std_vector_web_question.cc


Does this work for anyone else? If not, can you explain where it is
going wrong? Thanks.

-Darcy
Michael Hofmann
2007-12-19 08:34:18 UTC
Hi Plasma, (SNCR)
Post by p***@gmail.com
I believe that the problem lies in including a
std::vector object in an MPI derived datatype.
No, the problem is that you pass a pointer to an object (vect) to MPI_Send/MPI_Recv. What you need to pass is a pointer to an integer array. Even though C++ bindings for MPI exist, MPI is still made to work with arrays and pointers, not with objects and references.


A (nasty) hack resolving your problem looks like this:

Add the following public method to class "vect" in "std_vector_web_question.h":

int *gimme_pointer_to_a_vector() { return &a[0]; }

Use this method to gain direct access (a pointer) to the integer array hidden in class "vect":

MPI_Sendrecv(wsend.gimme_pointer_to_a_vector(),...,wrecv.gimme_pointer_to_a_vector(),...);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If "a" is a public member of class "vect", you can omit the additional method (gimme_pointer_...) and use things like &wsend.a[0], &wrecv.a[0], ... directly.
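For reference, here is a minimal sketch of such an accessor. The class and the name "data_ptr" are illustrative, not exactly Darcy's code; the point is that std::vector stores its elements contiguously, so &a[0] is a usable MPI buffer as long as the vector is non-empty and is not resized during the transfer.

```cpp
#include <cstddef>
#include <vector>

// Simplified version of the "vect" class with the accessor described
// above; "data_ptr" is an illustrative name, not from the thread.
class vect
{
private:
    std::vector<int> a;   // hidden integer array, stored contiguously

public:
    explicit vect(std::size_t n) : a(n, 0) {}

    // Pointer to the first element -- this is what MPI calls need.
    // Valid while the vector is non-empty and not being resized.
    int *data_ptr() { return &a[0]; }

    std::size_t size() const { return a.size(); }
    int &at(std::size_t i) { return a[i]; }
};
```

The MPI calls would then take the form MPI_Sendrecv(wsend.data_ptr(), 1, onevec, ...), with the matching change on the receive side.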


Michael
p***@gmail.com
2008-01-02 20:15:55 UTC
Post by Michael Hofmann
Hi Plasma, (SNCR)
Post by p***@gmail.com
I believe that the problem lies in including a
std::vector object in an MPI derived datatype.
No, the problem is that you pass a pointer to an object (vect) to MPI_Send/MPI_Recv. What you need to pass is a pointer to an integer array. Even though C++ bindings for MPI exist, MPI is still made to work with arrays and pointers, not with objects and references.
int *gimme_pointer_to_a_vector() { return &a[0]; }
MPI_Sendrecv(wsend.gimme_pointer_to_a_vector(),...,wrecv.gimme_pointer_to_a_vector(),...);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If "a" is a public member of class "vect", you can omit the additional method (gimme_pointer_...) and use things like &wsend.a[0], &wrecv.a[0], ... directly.
Michael
Michael,

Now that I am back to my parallel computer after a Christmas break I
have found that the code works after I fixed it up. I added this
public method to my class like you suggested:

inline int *pointer_to_vec() { return &a[0]; }

I need to define the vector as a private member because of the
constraints involved in the larger code I mentioned. If I think of a
better way to solve this (nothing immediately comes to mind) I'll post
it to this thread. Thanks again for your help!

-Darcy
p***@gmail.com
2008-01-19 00:44:20 UTC
Post by p***@gmail.com
Post by Michael Hofmann
Hi Plasma, (SNCR)
Post by p***@gmail.com
I believe that the problem lies in including a
std::vector object in an MPI derived datatype.
No, the problem is that you pass a pointer to an object (vect) to MPI_Send/MPI_Recv. What you need to pass is a pointer to an integer array. Even though C++ bindings for MPI exist, MPI is still made to work with arrays and pointers, not with objects and references.
int *gimme_pointer_to_a_vector() { return &a[0]; }
MPI_Sendrecv(wsend.gimme_pointer_to_a_vector(),...,wrecv.gimme_pointer_to_a_vector(),...);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If "a" is a public member of class "vect", you can omit the additional method (gimme_pointer_...) and use things like &wsend.a[0], &wrecv.a[0], ... directly.
Michael
Michael,
Now that I am back to my parallel computer after a Christmas break I
have found that the code works after I fixed it up. I added this
inline int *pointer_to_vec() { return &a[0]; }
I need to define the vector as a private member because of the
constraints involved in the larger code I mentioned. If I think of a
better way to solve this (nothing immediately comes to mind) I'll post
it to this thread. Thanks again for your help!
-Darcy
Here is something further that is not very clear with MPI and
std::vector containers. The above solution method works well if you
want to send one std::vector of one type. Suppose you had a class that
had the following private data members:

class foo
{
private:
    double a,b,c,d,e;
    int f,g,h,i,j;
    std::vector<T1> k;
    std::vector<T2> l;

public:
    ...
};

If the std::vectors were just regular arrays it would be easy to use
MPI_Type_struct to make a derived datatype for this class, but since
the vectors need a pointer (as was already discussed) it isn't
straightforward what to do. I'm not as familiar with the inner
workings of MPI_Type_struct as I should be, but from experimentation
MPI doesn't know what to do when I try to make a derived datatype from
a class like the one above.

The only way around it that I could think of was to send three
different messages. One for the basic datatypes and one each for the
two vectors of type T1 and T2. This gets very messy when you want to
send an array of the class datatype with MPI. Is it possible to make a
struct derived datatype of the class foo above so you can pass an
array of them with MPI? I guess I could also just make the
std::vectors regular arrays but I don't want to sacrifice the benefits
of the std container.

Thanks for any thoughts or input on this.

-Darcy
Michael Hofmann
2008-01-21 16:40:00 UTC
Post by p***@gmail.com
The only way around it that I could think of was to send three
different messages. One for the basic datatypes and one each for the
two vectors of type T1 and T2.
Yes.
Post by p***@gmail.com
This gets very messy when you want to
send an array of the class datatype with MPI. Is it possible to make a
struct derived datatype of the class foo above so you can pass an
array of them with MPI?
No, that's not possible. But, ...

- You can create a struct derived datatype that covers the basic datatypes when you want to send an array of the class. But all the std::vector objects still require separate send/recv operations.

- You can implement methods to serialize/deserialize the class or an array of the class and send/recv the serialized data with single operations.

- *VERY MESSY* You can create a separate struct datatype for each class object (use MPI_Get_address to obtain the offsets) and another (envelope) struct datatype that carries all the objects of the array. (Or you use only one (envelope) struct datatype that covers all members of all objects in the array.) Then you have to use MPI_BOTTOM as the send/recv buffer. The messy thing is that you have to create the derived datatype for every class object or array of class objects, and it is very doubtful whether there is any benefit from using one send/recv operation instead of multiple.


Michael
p***@gmail.com
2008-01-21 18:48:26 UTC
Michael,
Post by Michael Hofmann
No, that's not possible. But, ...
I was hoping you would tell me how wrong I was and that there was some
way to make an MPI derived datatype that included more than one
std::vector. Oh, well.
Post by Michael Hofmann
- You can create a struct derived datatype that covers the basic datatypes when you want to send an array of the class. But all the std::vector objects still require separate send/recv operations.
- You can implement methods to serialize/deserialize the class or an array of the class and send/recv the serialized data with single operations.
- *VERY MESSY* You can create a separate struct datatype for each class object (use MPI_Get_address to obtain the offsets) and another (envelope) struct datatype that carries all the objects of the array. (Or you use only one (envelope) struct datatype that covers all members of all objects in the array.) Then you have to use MPI_BOTTOM as the send/recv buffer. The messy thing is that you have to create the derived datatype for every class object or array of class objects, and it is very doubtful whether there is any benefit from using one send/recv operation instead of multiple.
The serialize/deserialize option seems to be the best choice but I am
not familiar with this operation. I have seen a boost library that
does serialization
<http://www.boost.org/libs/serialization/doc/index.html>
but I was wondering if you had any suggestions of where to
start to learn how to do this? Thanks.

-Darcy
Michael Hofmann
2008-01-22 21:07:37 UTC
Post by p***@gmail.com
The serialize/deserialize option seems to be the best choice but I am
not familiar with this operation. I have seen a boost library that
does serialization <http://www.boost.org/libs/serialization/doc/
index.html> but I was wondering if you had any suggestions of where to
start to learn how to do this? Thanks.
"... the term "serialization" ... mean[s] the reversible deconstruction of an arbitrary set of C++ data structures to a sequence of bytes."

From the boost example, I haven't found out what the real benefit of using the boost serialization library is. If your application is as simple as the foo class (or the array of foo classes) example, serialization can be done in a straightforward way:

1. determine the number of bytes required to store the serialized object
2. allocate a buffer with the required number of bytes
3. write all data one after another to the buffer

To serialize arrays/vectors, you can write the number of elements first, followed by the elements.

Deserialization works the same way, in reverse. Since you know what the buffer contains, you can read back all the data one after another. If you have to read an array/vector, you read the number of elements first and then the appropriate number of elements.

And if you are an experienced C++ programmer, you can do the whole job with streams and overloaded operators (<<, >>).
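A minimal sketch of these three steps, for a cut-down foo with one double, one int, and one vector (the names packed_size/pack/unpack are illustrative, not from the thread):

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Cut-down version of the foo class from earlier in the thread.
struct foo
{
    double a;
    int f;
    std::vector<int> k;
};

// Step 1: bytes required -- fixed members plus a length prefix
// followed by the elements for each vector.
std::size_t packed_size(const foo &x)
{
    return sizeof x.a + sizeof x.f
         + sizeof(std::size_t) + x.k.size() * sizeof(int);
}

// Step 3: write all data one after another into a preallocated buffer
// (step 2); for the vector, the element count goes first.
void pack(const foo &x, char *buf)
{
    std::memcpy(buf, &x.a, sizeof x.a);  buf += sizeof x.a;
    std::memcpy(buf, &x.f, sizeof x.f);  buf += sizeof x.f;
    std::size_t n = x.k.size();
    std::memcpy(buf, &n, sizeof n);      buf += sizeof n;
    if (n)
        std::memcpy(buf, &x.k[0], n * sizeof(int));
}

// Deserialization reads everything back in the same order: the count
// first, then that many elements.
void unpack(const char *buf, foo &x)
{
    std::memcpy(&x.a, buf, sizeof x.a);  buf += sizeof x.a;
    std::memcpy(&x.f, buf, sizeof x.f);  buf += sizeof x.f;
    std::size_t n;
    std::memcpy(&n, buf, sizeof n);      buf += sizeof n;
    x.k.resize(n);
    if (n)
        std::memcpy(&x.k[0], buf, n * sizeof(int));
}
```

The buffer produced by pack() can then be shipped with a single MPI_Send of MPI_BYTE (or MPI_CHAR) and rebuilt with unpack() on the receiving side. Note this assumes both ends agree on type sizes and endianness, which is usually the case inside one homogeneous cluster.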


Michael
grsobhani
2008-03-06 08:26:10 UTC
Hi everyone,
I have a question about using MPICH2. I wrote a simple program to
describe my question. I want to run a member function of a class which
is stored in a vector. (If you look at the code below you will
understand what I mean.) I want to run "MakeMe" on different nodes of
the cluster in parallel. My question is how I can correct this program
so that I can run "MakeMe" in parallel. And is it possible to use
MPI to deal with this issue?

I really appreciate your help.

ReZa

#include "mpi.h"
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <sstream>

using namespace std;

class MKFILE
{
public:
    MKFILE(){}
    ~MKFILE(){}
    void MakeMe(int fn);
};

void MKFILE::MakeMe(int fn)
{
    ofstream out;

    string s;
    stringstream sout;
    sout << fn;
    s = sout.str();
    out.open(s.c_str());

    for(int i=0; i<100; i++)
    {
        out << i+1 << endl;
    }
    out.close();
}

int main(int argc,char **argv)
{
    int id,procid;
    ofstream out("log.log",ios::app);

    vector<MKFILE> pfile(100);
    int k=0;

    MPI::Init(argc,argv);
    procid = MPI::COMM_WORLD.Get_size();
    id = MPI::COMM_WORLD.Get_rank();

    for(vector<MKFILE>::iterator i=pfile.begin();i!=pfile.end();i++)
    {
        out << "My ID : " << id << " No. Proc: " << procid << endl;
        i->MakeMe(k++);
    }

    MPI::Finalize();
    pfile.clear();

    return 0;
}
Michael Hofmann
2008-03-07 07:59:00 UTC
Post by grsobhani
I have a question about using MPICH2. I wrote a simple program to
describe my question. I want to run a member function of a class which
is stored in a vector. (If you look at the code below you will
understand what I mean.) I want to run "MakeMe" on different nodes of
the cluster in parallel. My question is how I can correct this program
so that I can run "MakeMe" in parallel. And is it possible to use
MPI to deal with this issue?
Sure, it is. If you insert

k = id * pfile.size() / procid;
pfile.resize(((int) (id + 1) * pfile.size() / procid) - k);

in front of the for-loop, the 100 files are created "in parallel" by the
member function using multiple MPI processes.

Additionally, you have to distribute the elements of your vector to the
MPI processes in advance. But this depends on what the elements are and
where they are coming from (not shown in the simple program).
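The arithmetic behind this partitioning can be sketched as follows (block_begin/block_end are illustrative names): process id of procid processes handles the half-open range [id*total/procid, (id+1)*total/procid), which covers all elements with no gaps and no overlaps.

```cpp
#include <cstddef>

// First element (inclusive) owned by process "id" of "procid"
// processes when "total" elements are split into contiguous blocks.
std::size_t block_begin(int id, int procid, std::size_t total)
{
    return (std::size_t)id * total / procid;
}

// One past the last element owned by process "id"; equals
// block_begin(id + 1, ...), so the blocks tile [0, total) exactly.
std::size_t block_end(int id, int procid, std::size_t total)
{
    return (std::size_t)(id + 1) * total / procid;
}
```

With total = 100 and procid = 3, the ranks get [0,33), [33,66), and [66,100); the integer division spreads any remainder over the blocks automatically.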


Michael
p***@gmail.com
2008-04-04 01:04:56 UTC
try http://tpo.sourceforge.net/
Post by Michael Hofmann
Post by grsobhani
I have a question about using MPICH2. I wrote a simple program to
describe my question. I want to run a member function of a class which
is stored in a vector. (If you look at the code below you will
understand what I mean.) I want to run "MakeMe" on different nodes of
the cluster in parallel. My question is how I can correct this program
so that I can run "MakeMe" in parallel. And is it possible to use
MPI to deal with this issue?
Sure, it is. If you insert
k = id * pfile.size() / procid;
pfile.resize(((int) (id + 1) * pfile.size() / procid) - k);
in front of the for-loop, the 100 files are created "in parallel" by the member function using multiple MPI processes.
Additionally, you have to distribute the elements of your vector to the MPI processes in advance. But, this depends on what the elements are and where they are coming from (not shown in the simple program).
Michael