Discussion: MPI File Access
Ryan
2008-10-31 17:36:45 UTC
Hello all,

I am new to MPI programming. I am currently parallelizing an
algorithm on a 128-node Linux cluster (all C++ development).


The original program received its input from two files ranging from
5-20 KB in length. I used ifstream to do this. It reads through
the files and dumps a lot of data into a C++ class that I have
written. This class has a large array of doubles and some other data
members. I mention this because it is a very complex object, and I
know that if I want to pass it through MPI_Send I will have to
account for all of the data members.

I would like to know what to do for the parallelized version (i.e.,
what the best practices for performance are). As it stands right now,
every single worker node opens the input files through ifstream and
creates its own object. I feel like this is very inefficient.

Since every process is only reading from the file, does this cause a
performance issue? Would it be better to have the head node read in
the file and then pass the object to every worker node? Should I be
using MPI_File_read?
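
To make that concrete, here is a rough sketch of the "head node reads,
then passes the object" idea as I understand it. The class layout and
file format below are made-up stand-ins for my real code, not the
actual thing:

#include <mpi.h>
#include <fstream>
#include <vector>

// Made-up stand-in for my real input class: one big double array
// plus a couple of scalar members.
struct InputData {
    int count;
    double scale;
    std::vector<double> values;
};

// Rank 0 reads the file; every data member is then broadcast
// explicitly, since MPI knows nothing about the class layout.
// Assumes MPI_Init has already been called.
void distribute_input(InputData &d)
{
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        std::ifstream in("input.dat");   // placeholder file name
        in >> d.count >> d.scale;
        d.values.resize(d.count);
        for (int i = 0; i < d.count; ++i) in >> d.values[i];
    }

    MPI_Bcast(&d.count, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Bcast(&d.scale, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    if (rank != 0) d.values.resize(d.count);
    MPI_Bcast(&d.values[0], d.count, MPI_DOUBLE, 0, MPI_COMM_WORLD);
}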

I appreciate your help in this. I am looking forward to hearing your
input! Thanks.

Ryan
Michael Hofmann
2008-11-03 11:25:38 UTC
Post by Ryan
I would like to know what to do for the parallelized version (i.e.,
what the best practices for performance are). As it stands right now,
every single worker node opens the input files through ifstream and
creates its own object. I feel like this is very inefficient.
This is not inefficient, as long as all worker nodes do it in parallel.
Post by Ryan
Since every process is only reading from the file, does this cause a
performance issue? Would it be better to have the head node read in
the file and then pass the object to every worker node? Should I be
using MPI_File_read?
I appreciate your help in this. I am looking forward to hearing your
input! Thanks.
Use the "ifstream" version, as long as there are no performance problems
on your cluster (128 * 2 * 20 KB = 5 MB, so bandwidth should not be a
problem). If there are problems with the file input, you can still try
MPI_File_read[_all].
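
If you do try the MPI-IO route, a minimal sketch could look like this
(the file name is a placeholder, and since your files are only a few
KB, the whole file can simply be read into one buffer):

#include <mpi.h>
#include <vector>

// Minimal MPI-IO sketch: every rank reads the whole (small) file
// collectively into a buffer.
std::vector<char> read_whole_file(const char *name)
{
    MPI_File fh;
    MPI_Offset size;

    MPI_File_open(MPI_COMM_WORLD, const_cast<char *>(name),
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    MPI_File_get_size(fh, &size);

    std::vector<char> buf(size);
    // Collective read: all ranks read the same byte range, which is
    // fine here because the files are so small.
    MPI_File_read_all(fh, &buf[0], (int)size, MPI_CHAR, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    return buf;
}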


Michael
Ryan
2008-11-03 17:05:28 UTC
Post by Michael Hofmann
Post by Ryan
I would like to know what to do for the parallelized version (i.e.,
what the best practices for performance are). As it stands right now,
every single worker node opens the input files through ifstream and
creates its own object. I feel like this is very inefficient.
This is not inefficient, as long as all worker nodes do it in parallel.
Post by Ryan
Since every process is only reading from the file, does this cause a
performance issue?  Would it be better to have the head node read in
the file and then pass the object to every worker node?  Should I be
using MPI_File_read?
I appreciate your help in this. I am looking forward to hearing your
input! Thanks.
Use the "ifstream" version, as long as there are no performance problems
on your cluster (128 * 2 * 20 KB = 5 MB, so bandwidth should not be a
problem). If there are problems with the file input, you can still try
MPI_File_read[_all].
Michael
Michael,

Thanks for your input. Could you suggest any good tools for testing
the performance of my program? I.e., something that could help locate
bottlenecks, etc.

Ryan
Michael Hofmann
2008-11-04 08:43:23 UTC
Post by Ryan
Michael,
Thanks for your input. Could you suggest any good tools for testing
the performance of my program? I.e., something that could help locate
bottlenecks, etc.
(For the file input issue) Place some "MPI_Wtime" calls around the file
input code and look at how the time changes when going from 1 to 128 nodes.
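
For example (a rough sketch; "read_input" is a placeholder standing in
for your existing ifstream code):

#include <mpi.h>
#include <cstdio>

void read_input();   // placeholder for your existing ifstream code

// Time the input phase; the slowest rank determines the effective
// cost, so reduce with MPI_MAX.
void timed_input()
{
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    read_input();
    double t1 = MPI_Wtime();

    double local = t1 - t0, maxt;
    MPI_Reduce(&local, &maxt, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        std::printf("file input: %f s (max over all ranks)\n", maxt);
}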

Otherwise, there are many MPI performance analysis tools, e.g. TAU,
VAMPIR, KOJAK, Paraver, mpiP, IBM HPCT, Intel Trace Analyzer and
Collector, ...


Michael
Ryan
2008-11-05 23:45:29 UTC
Post by Michael Hofmann
Post by Ryan
Michael,
Thanks for your input. Could you suggest any good tools for testing
the performance of my program? I.e., something that could help locate
bottlenecks, etc.
(For the file input issue) Place some "MPI_Wtime" calls around the file
input code and look at how the time changes when going from 1 to 128 nodes.
Otherwise, there are many MPI performance analysis tools, e.g. TAU,  
VAMPIR, KOJAK, Paraver, mpiP, IBM HPCT, Intel Trace Analyzer and  
Collector, ...
Michael
Thanks for all of your help, Michael.
