DC, LG, PF | May 27, 2022

Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications

arXiv:2205.13963v1 | 5 citations | h-index: 39
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of analyzing the dynamics of parallel programs, a topic relevant to researchers and developers in high-performance computing. It appears incremental, however: it builds on existing techniques and adds a new visualization method.

The paper tackles the problem of identifying and characterizing desynchronization patterns in MPI-parallel applications by applying data analytics and machine learning techniques to performance data. It shows that these patterns can be identified from a data set much smaller than a full MPI trace.

This paper studies the utility of using data analytics and machine learning techniques for identifying, classifying, and characterizing the dynamics of large-scale parallel (MPI) programs. To this end, we run microbenchmarks and realistic proxy applications with the regular compute-communicate structure on two different supercomputing platforms and choose the per-process performance and MPI time per time step as relevant observables. Using principal component analysis, clustering techniques, correlation functions, and a new "phase space plot," we show how desynchronization patterns (or lack thereof) can be readily identified from a data set that is much smaller than a full MPI trace. Our methods also lead the way towards a more general classification of parallel program dynamics.
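To make the idea of a correlation function on per-process MPI time concrete, here is a minimal sketch in Python. It is not the paper's implementation: the exact definitions used by the authors are not given in this summary, so the rank-distance Pearson correlation, the periodic rank topology, and the names `pearson`, `rank_correlation`, and `synthetic_wave` are all assumptions made for illustration. The input is a small table with one row per time step and one column per MPI rank, which is far less data than a full trace.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_correlation(mpi_time, distance):
    """Average correlation between the MPI-time series of ranks `distance` apart.

    mpi_time[t][r] is the MPI time of rank r in time step t.
    Ranks are assumed to be arranged periodically (an illustrative assumption).
    """
    n_ranks = len(mpi_time[0])
    # Transpose: one time series per rank.
    series = [[row[r] for row in mpi_time] for r in range(n_ranks)]
    values = [
        pearson(series[r], series[(r + distance) % n_ranks])
        for r in range(n_ranks)
    ]
    return sum(values) / n_ranks


def synthetic_wave(n_steps=160, n_ranks=16, speed=1):
    """Hypothetical data: a delay wave travelling across the ranks over time."""
    return [
        [
            1.0 + 0.5 * math.sin(2 * math.pi * (r - speed * t) / n_ranks)
            for r in range(n_ranks)
        ]
        for t in range(n_steps)
    ]
```

On the synthetic wave, `rank_correlation(synthetic_wave(), d)` is close to 1 at distance 0 and close to -1 at half the number of ranks, which is the kind of structured, non-synchronous signature one would look for. A perfectly synchronized run with constant MPI time would return 0.0 at every distance under the convention chosen in `pearson`.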

