AITuning: Machine Learning-based Tuning Tool for Run-Time Communication Libraries
This work addresses performance optimization in parallel computing for developers and researchers, though it is incremental as it applies existing reinforcement learning methods to a specific domain.
The paper tackles the problem of tuning communication libraries for better performance in parallel applications by using a deep reinforcement learning approach. It demonstrates the approach on the OpenCoarrays library and reports improved performance without human intervention.
In this work, we address the problem of tuning communication libraries by using a deep reinforcement learning approach. Reinforcement learning is a machine learning technique that is highly effective at solving game-like problems. Tuning a set of parameters in a communication library to get better performance in a parallel application can itself be expressed as a game: find the combination of settings (the path) that yields the best reward. Although AITuning has been designed to be used with different run-time libraries, this work focuses on applying it to OpenCoarrays, a run-time communication library built on top of MPI-3. This work not only shows the potential of using a reinforcement learning algorithm for tuning communication libraries, but also demonstrates how the MPI Tool Information Interface, introduced by the MPI-3 standard, can be used effectively by run-time libraries to improve performance without human intervention.
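To make the "tuning as a game" framing concrete, the following is a minimal sketch, not AITuning itself. It swaps the deep reinforcement learning agent for a much simpler tabular epsilon-greedy bandit, and it replaces real application runs with a synthetic cost function. The parameter names (`eager_limit_kb`, `async_progress`), their values, and `simulated_runtime` are all hypothetical and do not correspond to actual OpenCoarrays or MPI control variables.

```python
import itertools
import random

# Hypothetical tunable parameters; illustrative names only, not real
# OpenCoarrays or MPI control variables.
SEARCH_SPACE = {
    "eager_limit_kb": [16, 64, 256],
    "async_progress": [0, 1],
}


def simulated_runtime(config, rng):
    """Synthetic stand-in for one timed run of the parallel application (seconds)."""
    runtime = 10.0
    runtime += abs(config["eager_limit_kb"] - 64) / 64.0  # assumed optimum at 64 KB
    runtime -= 1.5 * config["async_progress"]             # assumed benefit of async progress
    return runtime + rng.gauss(0.0, 0.1)                  # measurement noise


def tune(episodes=300, epsilon=0.2, seed=0):
    """Epsilon-greedy search: each action is one full parameter combination."""
    rng = random.Random(seed)
    names = list(SEARCH_SPACE)
    actions = [dict(zip(names, values))
               for values in itertools.product(*SEARCH_SPACE.values())]
    value = [0.0] * len(actions)  # running mean reward per action
    count = [0] * len(actions)    # number of times each action was tried

    for _ in range(episodes):
        if rng.random() < epsilon:
            i = rng.randrange(len(actions))                       # explore
        else:
            i = max(range(len(actions)), key=lambda k: value[k])  # exploit
        reward = -simulated_runtime(actions[i], rng)  # shorter run, higher reward
        count[i] += 1
        value[i] += (reward - value[i]) / count[i]    # incremental mean update

    best = max(range(len(actions)), key=lambda k: value[k])
    return actions[best]


if __name__ == "__main__":
    print(tune())
```

Because rewards are negative run times and estimates start at zero, every untried combination looks attractive at first, so each one is sampled at least once before the search settles. In a real setting, the call to `simulated_runtime` would be replaced by setting the chosen values through the library's tuning interface and timing an actual run; a deep reinforcement learning agent becomes useful when the parameter space is too large to enumerate as done here.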