Assessing the Generalizability of a Performance Predictive Model
This work addresses poor generalization in automated algorithm selection, a concern for both researchers and practitioners, but the contribution appears incremental because it assesses the problem rather than proposing a new solution.
The study tackles the failure of predictive models for algorithm performance to generalize to unseen problem instances. It proposes a workflow to estimate generalizability across benchmark suites, and the results show that patterns in the feature space correspond to those in the performance space.
A key component of automated algorithm selection and configuration, which in most cases are performed using supervised machine learning (ML) methods, is a well-performing predictive model. The predictive model takes the feature representations of a set of problem instances as input and predicts the algorithm performance achieved on them. Common ML models struggle to make predictions for instances whose feature representations are not covered by the training data, resulting in poor generalization to unseen problems. In this study, we propose a workflow to estimate how well a predictive model for algorithm performance, trained on one benchmark suite, generalizes to another. The workflow has been tested by training predictive models across benchmark suites, and the results show that generalizability patterns in the landscape feature space are reflected in the performance space.
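The idea that feature-space coverage should be reflected in prediction error can be illustrated with a small sketch. This is not the workflow proposed in the study, whose steps are not detailed here. It is a toy example under stated assumptions: the data are synthetic, a 1-nearest-neighbour lookup stands in for the predictive model, and the coverage rule (a quantile of leave-one-out nearest-neighbour distances within the training suite) and all function names are hypothetical choices made for illustration.

```python
import math
import random
from statistics import mean


def nearest_distance(point, pool):
    """Euclidean distance from `point` to its closest neighbour in `pool`."""
    return min(math.dist(point, other) for other in pool)


def coverage_threshold(train_features, quantile=0.95):
    """Assumed rule: the given quantile of leave-one-out nearest-neighbour
    distances inside the training suite marks the edge of 'covered' space."""
    loo = sorted(
        nearest_distance(p, train_features[:i] + train_features[i + 1:])
        for i, p in enumerate(train_features)
    )
    return loo[min(len(loo) - 1, int(quantile * len(loo)))]


def predict_1nn(point, train_features, train_perf):
    """Toy stand-in for the predictive model: copy the performance of the
    nearest training instance."""
    best = min(range(len(train_features)),
               key=lambda i: math.dist(point, train_features[i]))
    return train_perf[best]


def assess(train_features, train_perf, test_features, test_perf):
    """Group test instances by coverage; return (count, mean absolute error)."""
    threshold = coverage_threshold(train_features)
    errors = {"covered": [], "uncovered": []}
    for x, y in zip(test_features, test_perf):
        is_covered = nearest_distance(x, train_features) <= threshold
        group = "covered" if is_covered else "uncovered"
        errors[group].append(abs(predict_1nn(x, train_features, train_perf) - y))
    return {g: (len(e), mean(e) if e else float("nan"))
            for g, e in errors.items()}


# Synthetic data (assumption): two features per instance, and the
# "performance" is simply the sum of the two features.
rng = random.Random(0)


def sample(low, high, n):
    return [(rng.uniform(low, high), rng.uniform(low, high)) for _ in range(n)]


train_X = sample(0.0, 1.0, 50)                        # training benchmark suite
test_X = sample(0.0, 1.0, 20) + sample(3.0, 4.0, 20)  # half inside, half outside
train_y = [sum(x) for x in train_X]
test_y = [sum(x) for x in test_X]

report = assess(train_X, train_y, test_X, test_y)
for group, (count, mae) in report.items():
    print(f"{group}: n={count}, mean absolute error={mae:.3f}")
```

In this toy setting, test instances far from the training suite in feature space are the ones with large prediction errors, which is the kind of correspondence between feature space and performance space the abstract reports. A real study would replace the synthetic features with landscape features, the 1-nearest-neighbour lookup with the trained ML model, and the threshold rule with whatever coverage criterion the workflow defines.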