Ryan
2008-10-31 17:36:45 UTC
Hello all,
I am new to MPI programming. I am currently parallelizing an
algorithm on a 128-node Linux cluster (all C++ development).
The original program received its input from 2 files ranging from
5-20 KB in length. I used ifstream to do this. It reads through
the files and loads a lot of data into a C++ class that I have
written. This class has a large array of doubles and some other data
members. I mention this because it is a fairly complex object, and I
know that if I want to pass it through MPI_Send I will have to
account for all of the data members.
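To make that concrete, here is a rough sketch of what "accounting for
all of the data members" might look like. The class InputData, its
members, and the pack/unpack helpers are all hypothetical stand-ins,
not my real code. The idea is to flatten everything into one
contiguous buffer of doubles, which could then be handed to MPI_Send
or MPI_Bcast with MPI_DOUBLE (the MPI calls themselves are left out
here).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real class: one large array of
// doubles plus a couple of scalar members.
struct InputData {
    int id = 0;
    double scale = 0.0;
    std::vector<double> values;
};

// Flatten every member into one contiguous buffer laid out as
// [id, scale, count, values...]. Storing id and count as doubles is
// a simplification that is exact only for modestly sized integers.
std::vector<double> pack(const InputData& d) {
    std::vector<double> buf;
    buf.reserve(3 + d.values.size());
    buf.push_back(static_cast<double>(d.id));
    buf.push_back(d.scale);
    buf.push_back(static_cast<double>(d.values.size()));
    buf.insert(buf.end(), d.values.begin(), d.values.end());
    return buf;
}

// Rebuild the object on the receiving side from the same layout.
InputData unpack(const std::vector<double>& buf) {
    assert(buf.size() >= 3);
    InputData d;
    d.id = static_cast<int>(buf[0]);
    d.scale = buf[1];
    const std::size_t n = static_cast<std::size_t>(buf[2]);
    assert(buf.size() == 3 + n);
    d.values.assign(buf.begin() + 3, buf.end());
    return d;
}
```

The receiver would need to learn the buffer length before it can
receive the data, for example by sending the size first.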
I would like to know what to do for the parallelized version (i.e.
what the best practices for performance are). As it stands right now,
every single worker node opens the input files through ifstream and
creates its own object. I feel like this is very inefficient.
Since every process is only reading from the file, does this cause a
performance issue? Would it be better to have the head node read in
the file and then pass the object to every worker node? Should I be
using MPI_File_read?
I appreciate your help with this. I am looking forward to hearing
your input! Thanks.
Ryan