SE | AI | LG | PF | Feb 5, 2024

Predicting Configuration Performance in Multiple Environments with Sequential Meta-learning

arXiv:2402.03183v1 | 23 citations | h-index: 5 | Has Code | Proc. ACM Softw. Eng.
Originality: Highly original
AI Analysis

This paper addresses the challenge software engineers face in accurately modeling configuration performance across diverse environments. It offers a novel method for a known bottleneck rather than a foundational advance.

The paper tackles the problem of predicting software configuration performance across multiple environments by proposing SeMPL, a sequential meta-learning framework that trains on environments one at a time rather than in parallel. The results show that SeMPL outperforms 15 state-of-the-art models on 89% of the nine systems studied, with up to 99% accuracy improvement, and that its data efficiency yields up to a 3.86x speedup.

Learning and predicting the performance of given software configurations are of high importance to many software engineering activities. While configurable software systems will almost certainly face diverse running environments (e.g., version, hardware, and workload), current work often either builds performance models under a single environment or fails to properly handle data from diverse settings, hence restricting their accuracy for new environments. In this paper, we target configuration performance learning under multiple environments. We do so by designing SeMPL, a meta-learning framework that learns a common understanding from configurations measured in distinct (meta) environments and generalizes it to the unforeseen target environment. What makes it unique is that, unlike common meta-learning frameworks (e.g., MAML and MetaSGD) that train on the meta environments in parallel, SeMPL trains on them sequentially, one at a time. The order of training naturally allows discriminating the contributions among meta environments in the meta-model built, which better fits configuration data, known to differ dramatically between environments. Extensive experiments comparing SeMPL with 15 state-of-the-art models on nine systems demonstrate that SeMPL performs considerably better on 89% of the systems with up to 99% accuracy improvement, while being data-efficient, leading to a maximum of 3.86x speedup. All code and data can be found at our repository: https://github.com/ideas-labo/SeMPL.
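The sequential idea in the abstract can be illustrated with a toy sketch. This is not the SeMPL implementation (see the repository for that). It assumes a plain linear regressor trained by gradient descent, synthetic data, and a hand-picked environment order from least to most similar to the target. The names `sgd_fit`, `sequential_pretrain`, and `make_env` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def sgd_fit(w, X, y, lr=0.2, epochs=200):
    """Gradient descent on mean squared error for a linear model y ~ X @ w."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def sequential_pretrain(meta_envs, n_features):
    """Train on meta environments one at a time; each run starts from the
    weights left by the previous environment (no parallel averaging)."""
    w = np.zeros(n_features)
    for X, y in meta_envs:
        w = sgd_fit(w, X, y)
    return w

def make_env(rng, w_true, n):
    """Synthetic environment (assumption): performance is linear in the options."""
    X = rng.uniform(0.0, 1.0, size=(n, len(w_true)))
    y = X @ w_true + rng.normal(0.0, 0.01, size=n)
    return X, y

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

rng = np.random.default_rng(0)
w_target = np.array([1.0, -2.0, 0.5, 3.0])

# Meta environments are shifted copies of the target, ordered so that the
# most similar one is trained last and therefore contributes most.
# This ordering is an assumption made for the illustration.
meta_envs = [make_env(rng, w_target + shift, 200) for shift in (1.0, 0.5, 0.1)]

X_few, y_few = make_env(rng, w_target, 5)      # few measured target configurations
X_test, y_test = make_env(rng, w_target, 500)  # held-out target configurations

w_meta = sequential_pretrain(meta_envs, n_features=4)
w_tuned = sgd_fit(w_meta, X_few, y_few, epochs=20)        # fine-tune the meta-model
w_scratch = sgd_fit(np.zeros(4), X_few, y_few, epochs=20) # no meta-learning

print("fine-tuned meta-model MSE:", mse(w_tuned, X_test, y_test))
print("from-scratch model MSE:  ", mse(w_scratch, X_test, y_test))
```

In this toy setting, the fine-tuned meta-model starts close to the target and needs only a handful of target samples, which is the data-efficiency argument the abstract makes. The choice of model, ordering, and learning rate here are illustrative only.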

Code Implementations: 1 repo