Speedup Techniques for Irregular Sparse All-To-All
- Subject: Collective Operations in HPC
High Performance Computing (HPC) runs applications on distributed systems with thousands or even millions of processing elements (PEs). These supercomputers communicate over a high-speed network such as InfiniBand or a proprietary high-end interconnect.
Many applications spend a large part of their execution time on communication that scatters data across the network. Regular communication patterns include All-to-All and stencil exchanges. If the data is not well distributed, irregular (and even sparse) communication primitives are required. This opens the interesting field of irregular sparse communication, where each process sends only a logarithmic number of messages. However, simple point-to-point communication cannot be applied directly, and MPI's All-to-Allv does not scale well: merely constructing meta-data for every potential target PE on the local machine would already dominate the time of the sparse exchange.
Experiments have shown that data exchange algorithms which flood the network with non-blocking messages break down at large PE counts due to overloaded routing hardware. One remedy is to limit the number of messages routed through the system at any time. The goal is to design new communication primitives that limit the number of sent and received messages simultaneously. Algorithmic approaches such as matchings or edge colorings have to be transferred to distributed systems and integrated into the communication primitive to define the data exchange pattern. This thesis shall apply, develop, implement and benchmark such approaches so that irregular sparse All-to-All operations scale to very large numbers of PEs.
- Interest in parallel algorithms and HPC
- Good programming skills in C++
- Basic parallel programming skills in MPI
This description is also available as a PDF.
Application deadline: 10 March 2016