Discussion:
speed up this problem by MPI
Tim
2010-01-28 19:05:00 UTC
Hi,

(1). I am wondering how I can speed up the time-consuming computation
in the loop of my code below using MPI?
[CODE]
double complicated_computation(); // defined elsewhere; time-consuming
void f(int size);

int main(int argc, char ** argv)
{
    // some operations
    f(size);
    // some operations
    return 0;
}

void f(int size)
{
    // some operations
    int i;
    double * array = new double [size];
    // how can I use MPI to speed up this loop to compute
    // all elements in the array?
    for (i = 0; i < size; i++)
    {
        array[i] = complicated_computation(); // time-consuming computation
    }
    // some operations using all elements in array
    delete [] array;
}
[/CODE]
As shown in the code, I want to do some operations before and after
the part to be parallelized with MPI, but I don't know how to specify
where the parallel part begins and ends.

(2) My current code is using OpenMP to speed up the computation.
[CODE]
#include <omp.h>

void f(int size)
{
    // some operations
    int i;
    double * array = new double [size];
    omp_set_num_threads(_nb_threads);
    #pragma omp parallel shared(array) private(i)
    {
        // how can I use MPI to speed up this loop to compute
        // all elements in the array?
        #pragma omp for schedule(dynamic) nowait
        for (i = 0; i < size; i++)
        {
            array[i] = complicated_computation(); // time-consuming computation
        }
    }
    // some operations using all elements in array
    delete [] array;
}
[/CODE]
I wonder, if I switch to MPI, is it possible to have the code
written for both OpenMP and MPI? If so, how do I write the code,
and how do I compile and run it?

(3) Our cluster has three versions of MPI: mvapich-1.0.1,
mvapich2-1.0.3, and openmpi-1.2.6. Is their usage the same,
especially in my case? Which one is best for me to use?

Thanks and regards!
b***@myrealbox.com
2010-02-05 16:35:02 UTC
In article <f3832e12-0acb-4d83-98f1-***@h33g2000vbr.googlegroups.com>,
Tim <***@yahoo.com> wrote:

Okay, no one else is replying, so I'll give it a try ....
Post by Tim
Hi,
(1). I am wondering how I can speed up the time-consuming computation
in the loop of my code below using MPI?
[CODE]
double complicated_computation(); // defined elsewhere; time-consuming
void f(int size);

int main(int argc, char ** argv)
{
    // some operations
    f(size);
    // some operations
    return 0;
}

void f(int size)
{
    // some operations
    int i;
    double * array = new double [size];
    // how can I use MPI to speed up this loop to compute
    // all elements in the array?
    for (i = 0; i < size; i++)
    {
        array[i] = complicated_computation(); // time-consuming computation
    }
    // some operations using all elements in array
    delete [] array;
}
[/CODE]
As shown in the code, I want to do some operations before and after
the part to be parallelized with MPI, but I don't know how to specify
where the parallel part begins and ends.
How much do you know about MPI? It doesn't really have a notion of
"where the parallel part begins and ends" in the same way OpenMP
does. Instead ....

First a caveat. I'm going to talk mostly about pre-2.0 versions
of MPI, since that's what I'm most familiar with. Some of what
I say will not be entirely true of MPI 2.0. I'll try to at least
point out where the differences are.

I'm also going to assume that you aren't familiar with message
passing in general or MPI in particular.

A running MPI program typically consists of a set of processes
all running the same executable, each in its own memory space and
therefore with its own data -- the "Single Program, Multiple Data"
(SPMD) model. (Different processes might take different paths
through the single program.) Processes interact by sending each
other messages rather than via a shared memory space.
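
To make that concrete, here's a minimal sketch of the SPMD model
(my own toy example, not your program):

[CODE]
#include <mpi.h>
#include <cstdio>

int main(int argc, char ** argv)
{
    MPI_Init(&argc, &argv);                 // every process starts here

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // this process's ID
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs); // total number of processes

    // Same program, but each process can branch on its rank.
    printf("I am process %d of %d\n", rank, nprocs);

    MPI_Finalize();
    return 0;
}
[/CODE]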

(MPI 2.0 supports dynamic process creation, though I suspect it's
not very fast, and allows different processes to run completely
different executables. It also provides something called "one-sided
communication" that supposedly is *sort of* like shared memory,
though not exactly. I don't know enough about it to say more.)

The lack of a shared memory space has a significant consequence:
for any variable that would be shared in an OpenMP program, the
programmer has to decide whether each process needs a complete
copy of it or whether it makes sense to split the data up and
distribute it among the processes.

If all of this is new to you, I recommend doing some reading about
message-passing programming and/or MPI; I doubt that a single
newsgroup post can begin to tell you enough. I'll just make a few
comments on your specific questions below.
Post by Tim
(2) My current code is using OpenMP to speed up the computation.
[CODE]
#include <omp.h>

void f(int size)
{
    // some operations
    int i;
    double * array = new double [size];
    omp_set_num_threads(_nb_threads);
    #pragma omp parallel shared(array) private(i)
    {
        // how can I use MPI to speed up this loop to compute
        // all elements in the array?
        #pragma omp for schedule(dynamic) nowait
        for (i = 0; i < size; i++)
        {
            array[i] = complicated_computation(); // time-consuming computation
        }
    }
    // some operations using all elements in array
    delete [] array;
}
[/CODE]
How to achieve something similar in MPI .... As in OpenMP,
the usual way to parallelize a loop in which the iterations are
independent (as I'm guessing they are here) is to split up the
iterations among processes. The lack of a shared memory space,
though, means that you have to think about what data is used in
complicated_computation() and whether it will be duplicated in
each process or distributed in some way, and if the latter, how
to arrange the necessary communication. Also as in OpenMP, one
does need to think about whether the amount of computation is the
same for each element or can vary. Your use of "schedule(dynamic)"
suggests to me that for your program it's the latter. Dynamically
assigning work to processes is possible in MPI, but with more
effort than in OpenMP.
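
To sketch the simplest (static) version: each process computes a
contiguous block of the iterations and rank 0 collects the pieces.
This is only a sketch, and it assumes complicated_computation()
needs nothing but the loop index -- a guess on my part, since your
posting doesn't show it:

[CODE]
#include <mpi.h>
#include <vector>

double complicated_computation(int i); // hypothetical: assumes only i is needed

void f(int size)
{
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // Block-partition the iterations; the first size % nprocs
    // ranks get one extra element each.
    std::vector<int> counts(nprocs), displs(nprocs);
    for (int p = 0, offset = 0; p < nprocs; p++) {
        counts[p] = size / nprocs + (p < size % nprocs ? 1 : 0);
        displs[p] = offset;
        offset += counts[p];
    }

    // Each process computes only its own block.
    std::vector<double> mine(counts[rank]);
    for (int i = 0; i < counts[rank]; i++)
        mine[i] = complicated_computation(displs[rank] + i);

    // Rank 0 gathers the blocks into the full array.
    double * array = (rank == 0) ? new double [size] : 0;
    MPI_Gatherv(&mine[0], counts[rank], MPI_DOUBLE,
                array, &counts[0], &displs[0], MPI_DOUBLE,
                0, MPI_COMM_WORLD);

    if (rank == 0) {
        // some operations using all elements in array
        delete [] array;
    }
}
[/CODE]

If every process needs the full array for the operations that
follow, MPI_Allgatherv (same arguments, minus the root) gathers
into all processes at once. A dynamic, master/worker assignment --
closer in spirit to your schedule(dynamic) -- would have rank 0
hand out indices with MPI_Send/MPI_Recv as workers finish; that's
the "more effort" I mentioned.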
Post by Tim
I wonder, if I switch to MPI, is it possible to have the code
written for both OpenMP and MPI? If so, how do I write the code,
and how do I compile and run it?
I'm not sure what you mean by "have the code written for both OpenMP
and MPI" -- are you thinking maybe that the same code could be used
for one or the other, or that you want to combine both (e.g., so it
will run nicely on a cluster of multicore machines)? The latter is
certainly possible, but the former -- I don't think it's feasible,
because the two programming environments are based on different
models (shared memory versus message passing).
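
For the combined case, the usual structure is exactly the
composition of the two: MPI splits the iterations among processes
(as sketched above), and within each process OpenMP splits that
process's block among threads. A hypothetical fragment:

[CODE]
#include <omp.h>

double complicated_computation(int i); // hypothetical, as before

// Compute this process's block [lo, lo + count) with OpenMP threads.
void compute_block(double * mine, int lo, int count)
{
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < count; i++)
        mine[i] = complicated_computation(lo + i);
}
[/CODE]

As long as all MPI calls happen outside the OpenMP regions, as
here, this is safe with any MPI implementation; if the threads
themselves made MPI calls, you'd need to request thread support
via MPI_Init_thread, an MPI-2 feature.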
Post by Tim
(3) Our cluster has three versions of MPI: mvapich-1.0.1,
mvapich2-1.0.3, and openmpi-1.2.6. Is their usage the same,
especially in my case? Which one is best for me to use?
Assuming all of them support the same version of MPI (1.1, 2.0,
etc.), code that compiles and runs with one should compile and run
with the others; the differences would be performance and possibly
the commands you use to compile and run.
Which one is best would then presumably depend on which one gives
the best performance, and maybe someone local can help you with
that, or you can try them all and compare. If they don't all
support the same version of MPI, then what version of MPI you
want would factor into your decision.
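
Concretely, all three give you a compiler wrapper and a launcher.
The exact names and options vary by installation (some sites use
mpiexec or mpirun_rsh instead of mpirun, and -fopenmp is the GCC
spelling of the OpenMP flag), so check your cluster's documentation,
but typically it looks something like:

[CODE]
mpicxx -fopenmp -o myprog myprog.cpp   # wrapper around the system C++ compiler
mpirun -np 8 ./myprog                  # launch 8 MPI processes
[/CODE]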
Post by Tim
Thanks and regards!
I hope this is some help!
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.