Using Bad Learners to find Good Configurations
This paper addresses the problem of expensive configuration optimization faced by software engineers, offering a practical, incremental improvement over existing methods.
The paper tackles the challenge of finding optimal software configurations by proposing a rank-based approach that uses cheap, inaccurate performance models to rank configurations, reducing the cost and time needed for model building. The approach was beneficial in 16 out of 21 scenarios based on experiments with 9 software systems.
Finding the optimally performing configuration of a software system for a given setting is often challenging. Recent approaches address this challenge by learning performance models based on a sample set of configurations. However, building an accurate performance model can be very expensive (and is often infeasible in practice). The central insight of this paper is that exact performance values (e.g., the response time of a software system) are not required to rank configurations and to identify the optimal one. As shown by our experiments, models that are cheap to learn but inaccurate (with respect to the difference between actual and predicted performance) can still be used to rank configurations and hence find the optimal configuration. This novel \emph{rank-based approach} allows us to significantly reduce the cost (in terms of the number of measurements of sample configurations) as well as the time required to build models. We evaluate our approach with 21 scenarios based on 9 software systems and demonstrate that our approach is beneficial in 16 scenarios; for the remaining 5 scenarios, an accurate model can be built by using very few samples anyway, without the need for a rank-based approach.
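
The central insight can be illustrated with a minimal sketch. It is a toy example, not the paper's algorithm or data: the performance values are made up, the "model" is a deliberately biased stand-in, and the helper names `mean_relative_error` and `actual_rank_of_predicted_best` are hypothetical. It shows only that a model with a large prediction error can still pick the best configuration when it preserves the ordering.

```python
# Toy illustration (assumed values, not measurements from the paper):
# a model can be inaccurate in absolute terms yet still rank correctly.

def mean_relative_error(actual, predicted):
    """Average of |actual - predicted| / actual over all configurations."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

def actual_rank_of_predicted_best(actual, predicted):
    """Actual rank (0 = best) of the configuration the model predicts as best.

    Lower performance values (e.g., response time) are treated as better.
    """
    predicted_best = min(range(len(predicted)), key=predicted.__getitem__)
    actual_order = sorted(range(len(actual)), key=actual.__getitem__)
    return actual_order.index(predicted_best)

# Hypothetical response times for five configurations.
actual = [120.0, 95.0, 210.0, 60.0, 150.0]

# A stand-in "bad learner": it roughly halves every value, so its
# predictions are far off, but the ordering of configurations is preserved.
predicted = [0.5 * a + 5.0 for a in actual]

error = mean_relative_error(actual, predicted)
rank = actual_rank_of_predicted_best(actual, predicted)

print(f"mean relative error: {error:.2f}")
print(f"actual rank of predicted-best configuration: {rank}")
```

In this sketch the relative error is above 40%, yet the configuration predicted to be best is also the one that is actually best. The example evaluates a model by the rank of its chosen configuration; how the paper itself decides when to stop sampling is not shown here.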