Post by Greg Lindahl
Post by Sebastian Hanigk
But this only works for eager sends or receives! If the amount of data
you're about to transfer exceeds some buffer limit, even the i-routines
will behave like the synchronous ones.
Not only is this implementation-dependent behavior, but your comment
doesn't make any sense. MPI_RECV always blocks until the data is
available. MPI_IRECV never does. So no, large transfers never make
MPI_IRECV behave like MPI_RECV. With IRECV, the blocking happens at
the MPI_WAIT.
I'm sorry for any misunderstanding; my comment above was written in a
slight hurry ...
Regarding MPI_Irecv I cannot say anything definite at the moment, but I
strongly assume your description is correct. Its complementary sending
routine, however, switches from an immediate return to blocking behaviour
once the message exceeds an implementation-dependent size threshold; a
small sketch of what I mean follows below.
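To make this concrete, here is a minimal sketch of the pattern I have in
mind: nonblocking send/receive of a message that is (on most MPIs) well
above the eager limit. The buffer size and the XOR partner pairing are
just placeholders, and the actual threshold is implementation-specific.

/* Minimal sketch: nonblocking exchange of a message that likely exceeds
 * the implementation's eager threshold.  Assumes an even number of ranks;
 * the buffer size is an arbitrary placeholder, not a real threshold. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;              /* 8 MB of doubles: rendezvous territory */
    double *sendbuf = malloc(n * sizeof(double));
    double *recvbuf = malloc(n * sizeof(double));
    for (int i = 0; i < n; ++i)
        sendbuf[i] = (double)rank;

    int peer = rank ^ 1;                /* pair up neighbouring ranks */
    MPI_Request req[2];

    /* Both calls return immediately; for a large message the actual
     * transfer may not progress until the MPI_Waitall below. */
    MPI_Irecv(recvbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req[0]);
    MPI_Isend(sendbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req[1]);

    /* ... computation one hopes to overlap with the communication ... */

    MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}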
Post by Greg Lindahl
And there is usually only one MPI_WAIT, no matter how many dimensions
your halo exchange has.
Yes. But if your halo exchange buffers are larger than the
implementation's threshold, you end up with blocking behaviour on each
exchange, whereas zero-copy RDMA access (using the term without any
connotation I may be unaware of) can avoid this.
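For reference, this is roughly the exchange structure I have in mind: each
direction of a 2-D halo exchange is posted as a nonblocking operation and
everything is completed by a single MPI_Waitall. Neighbour ranks, buffers
and counts are placeholders supplied by the caller.

#include <mpi.h>

/* Sketch of one 2-D halo exchange step: all four directions posted as
 * nonblocking operations, completed by a single MPI_Waitall. */
void exchange_halos(double *in_n, double *in_s, double *in_e, double *in_w,
                    double *out_n, double *out_s, double *out_e, double *out_w,
                    int count, int north, int south, int east, int west,
                    MPI_Comm comm)
{
    MPI_Request req[8];

    MPI_Irecv(in_n, count, MPI_DOUBLE, north, 0, comm, &req[0]);
    MPI_Irecv(in_s, count, MPI_DOUBLE, south, 1, comm, &req[1]);
    MPI_Irecv(in_e, count, MPI_DOUBLE, east,  2, comm, &req[2]);
    MPI_Irecv(in_w, count, MPI_DOUBLE, west,  3, comm, &req[3]);

    /* Tags are chosen so each send matches the receive posted by the
     * neighbour in the opposite direction. */
    MPI_Isend(out_s, count, MPI_DOUBLE, south, 0, comm, &req[4]);
    MPI_Isend(out_n, count, MPI_DOUBLE, north, 1, comm, &req[5]);
    MPI_Isend(out_w, count, MPI_DOUBLE, west,  2, comm, &req[6]);
    MPI_Isend(out_e, count, MPI_DOUBLE, east,  3, comm, &req[7]);

    /* interior computation could overlap here */

    MPI_Waitall(8, req, MPI_STATUSES_IGNORE);
}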
Post by Greg Lindahl
Now perhaps you're using a funny definition of "synchronization". But
it doesn't sound like a useful one.
I don't think I have given or used an unusual definition of
synchronisation; in MPI, there is an implicit synchronisation between
the sending and receiving party hidden in the respective calls to the
send or receive routines, with the exception of the immediate versions of
those routines whose behaviour depends on the transfer size.
Could it be that this discussion is going in circles because we're
misunderstanding each other? I'm in no way dismissing MPI as inferior,
but for some purposes it is very nice to have one-sided, passive-target
communication available. Without doubt the RDMA scheme has its own set of
problems (I just remembered a short article:
<http://www.hpcwire.com/hpc/815242.html>), and I'm still struggling with
the registration/pinning issues; compute node kernels without swapping
capability are a godsend in that respect.
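As a sketch of what I mean by one-sided, passive-target communication,
something along these lines (MPI-2 RMA): the window size, the target rank
and the final barrier are placeholders for a toy two-rank example, not a
recommended synchronisation scheme.

/* Sketch of passive-target one-sided communication: the target rank makes
 * no call that matches the transfer itself.  Needs at least two ranks. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1024;
    double *winbuf = malloc(n * sizeof(double));

    MPI_Win win;
    MPI_Win_create(winbuf, n * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0) {
        double *data = malloc(n * sizeof(double));
        for (int i = 0; i < n; ++i)
            data[i] = (double)i;

        /* Passive target: rank 1 does not participate in this epoch. */
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
        MPI_Put(data, n, MPI_DOUBLE, 1, 0, n, MPI_DOUBLE, win);
        MPI_Win_unlock(1, win);

        free(data);
    }

    /* Barrier only so the toy example finishes in order; a real code
     * would use its own notification/synchronisation scheme. */
    MPI_Barrier(MPI_COMM_WORLD);

    MPI_Win_free(&win);
    free(winbuf);
    MPI_Finalize();
    return 0;
}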
Sebastian