DC · May 5

Implementing True MPI Sessions and Evaluating MPI Initialization Scalability

arXiv:2605.03983
AI Analysis

For high-performance computing researchers and developers, this work addresses a scalability bottleneck in MPI initialization for exascale systems.

The authors implemented true MPI Sessions in MPICH, removing reliance on MPI_COMM_WORLD, and showed that this refactoring improves initialization scalability, with explicit hierarchical designs providing significant benefits.

Sessions is one of the major features introduced in the MPI-4 standard. It offers an alternative to the traditional world communicator model by allowing applications to construct communicators from process sets, thereby eliminating the dependency on MPI_COMM_WORLD. The Sessions model was proposed as a more scalable solution for exascale systems, where MPI_COMM_WORLD was viewed as a potential scalability bottleneck. However, supporting Sessions is a significant challenge for established codebases like MPICH because the world model is deeply integrated into traditional MPI implementations. Although MPICH added support for the MPI-4 standard upon its release, it still relied internally on a global world communicator. This approach enabled applications written using the Sessions model to function, but it did not fulfill the full design intent of Sessions, which was meant to decouple MPI from MPI_COMM_WORLD. We describe the MPICH effort to support true MPI Sessions, including a major internal refactoring and the architectural changes it required, and evaluate the scalability of the resulting implementation. Our results demonstrate that true Sessions can offer significant scalability benefits by adopting explicit hierarchical designs.
