Post by Jim Hill
Post by David Cronk
I am not sure what you mean by example, but if you mean how to solve
this I will take a shot. First, a little clarification about the data
layout in the file.
Let's say 3 PEs, with 3, 7, and 5 elements each, respectively.
012012012121211
or
012012012_12_12_1__1_ where _ means just no data written?
000111111122222
which in theory makes this even easier.
Yes, it sure does.
Post by Jim Hill
I can do a quick parallel communication so that rank-1 knows there are 3
elements coming before it and rank-2 knows there are 10 elements coming
before it, so they can compute offsets relative to the beginning of the
file and then just *splat* their contents in a continuous chunk
beginning at the offset. I just have to chase down the proper sequence
of MPI_something_something calls to achieve the task.
I suggest MPI_EXSCAN so each PE can figure out how many elements come
before its contribution in the file. After that you have many options,
and I recommend testing each for performance on your system. For your
data layout I suspect non-collective I/O may lead to the best results,
though you should test with collective I/O as well.
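Just as a rough sketch of that route (the filename, the use of doubles,
and the made-up per-rank counts below are only placeholders), the EXSCAN
step followed by a simple explicit-offset write (more on the offset
choices below) might look something like:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Placeholder: each PE has some different amount of data. */
    int nlocal = 3 + 2 * rank;
    double *buf = malloc(nlocal * sizeof(double));
    for (int i = 0; i < nlocal; i++) buf[i] = (double)rank;

    /* Exclusive prefix sum: nbefore = total elements on all lower
       ranks.  The result is undefined on rank 0, so start it at 0. */
    int nbefore = 0;
    MPI_Exscan(&nlocal, &nbefore, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    /* Explicit offset in bytes, so the default file view is kept. */
    MPI_Offset offset = (MPI_Offset)nbefore * sizeof(double);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "viz.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Non-collective explicit-offset write; MPI_File_write_at_all is
       the collective variant and is worth timing as well. */
    MPI_File_write_at(fh, offset, buf, nlocal, MPI_DOUBLE,
                      MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}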
Basically you have explicit offsets (no need even for a new file view as
long as you convert offsets to bytes), individual file pointers, and
shared file pointers. Shared file pointers are unlikely to give you the
desired performance, but I would try them just for giggles. Once you
have done the setup part, changing between the different choices is
quite simple.
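For instance (just a sketch, with names of my own), switching to the
individual file pointer route is mostly a matter of swapping the write
call once the byte offset from the EXSCAN is in hand:

#include <mpi.h>

/* Individual file pointer: seek this rank's own pointer to its byte
   offset, then write with MPI_File_write (or MPI_File_write_all for
   the collective flavor). */
void write_with_individual_pointer(MPI_File fh, MPI_Offset offset,
                                   const double *buf, int nlocal)
{
    MPI_File_seek(fh, offset, MPI_SEEK_SET);
    MPI_File_write(fh, buf, nlocal, MPI_DOUBLE, MPI_STATUS_IGNORE);
}

/* The shared file pointer flavor (MPI_File_write_shared or the
   collective MPI_File_write_ordered) needs no offset at all, which is
   exactly why its performance is usually disappointing. */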
Post by Jim Hill
Post by David Cronk
Hope this helps (and that this was not a homework problem).
It does and it was not. The project that I work on makes extensive use
of visualization dumps. Currently we do that via a gather to rank-0,
which does the file I/O while all the other processors hang out, tell
fishing stories, check up on the wife and kids, the usual. When rank-0
finishes up, it's back to work. Based on rough profiling I've done,
we're spending about 75% of our viz dump time in the current file I/O.
Not elegant but it gets the job done.
More and more apps are running into the I/O bottleneck. I expect
parallel I/O to become essential to most HPC apps in the near future.
Sounds like the other PEs have no other work they can be doing while
waiting on the I/O so perhaps non-blocking I/O will not help here.
Though, if possible, you may want to investigate making some changes to
the code to allow overlap of I/O and computation. Just a thought. You
know the code, so you should be a good judge of whether it is worth the effort.
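If you do find some computation to overlap with, the non-blocking
explicit-offset write is the natural tool; a rough sketch (the
do_some_computation call is only a stand-in for whatever work does not
touch the buffer):

#include <mpi.h>

void do_some_computation(void);  /* placeholder for overlapped work */

/* Start the write, compute while it is (hopefully) in flight, then
   wait.  Whether real overlap happens depends on the implementation. */
void overlapped_dump(MPI_File fh, MPI_Offset offset,
                     const double *buf, int nlocal)
{
    MPI_Request req;
    MPI_File_iwrite_at(fh, offset, buf, nlocal, MPI_DOUBLE, &req);
    do_some_computation();
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}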
One other thing here: if performance becomes critical, advanced features
of MPI-I/O should be explored, like the app controlling file striping and
other things that may be controllable through use of the "info" argument
in the MPI-I/O calls.
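For example, a sketch of passing striping hints at open time;
"striping_factor" and "striping_unit" are reserved hint keys in the MPI
standard, but whether the underlying file system actually honors them
(and what values make sense) is implementation dependent:

#include <mpi.h>

MPI_File open_with_striping_hints(const char *path)
{
    MPI_Info info;
    MPI_File fh;

    MPI_Info_create(&info);
    /* Hint values are passed as strings: 16 stripes of 1 MiB each. */
    MPI_Info_set(info, "striping_factor", "16");
    MPI_Info_set(info, "striping_unit", "1048576");

    MPI_File_open(MPI_COMM_WORLD, path,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
    MPI_Info_free(&info);
    return fh;
}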
Post by Jim Hill
However, the problems that our users are solving are getting ever-larger
and the portion of problem time dedicated to viz dumps is growing to
unacceptable levels. One way to minimize this is to ask the users to
request less-frequent dumps with less data per dump. This has been greeted
with a notable lack of enthusiasm. So yours truly suggested parallel
I/O and as the first person to pipe up received the plum assignment of
implementing it. We're reasonably well load-balanced but I think the
only place where you find a perfect 1/numprocs decomposition is in
reference materials which then take advantage of that decomposition to
present an unrealistic source code example (e.g., they all use an
identical value of "buflen" for the quantity of data a particular
processor will be writing so rank*buflen is an obvious offset into the
file).
I have talked to some of our HPC folks here about the task at hand and
they were pretty keen so it looks like I'll get more local support than
I was anticipating.
Thanks for taking the time to reply and you can sleep easy knowing you
weren't an inadvertent party to academic misconduct.
Glad I could help, and that I did not fall into the trap of a student
looking for unfair help :)
Good luck with this effort.
Dave
--
Dr. David Cronk, Ph.D. phone: (865) 974-3735
Research Director fax: (865) 974-8296
Innovative Computing Lab http://www.cs.utk.edu/~cronk
University of Tennessee, Knoxville