Peter Korošec

LG
8 papers
83 citations
Novelty: 36%
AI Score: 22

8 Papers

NE, Apr 15, 2022
The Importance of Landscape Features for Performance Prediction of Modular CMA-ES Variants

Ana Kostovska, Diederick Vermetten, Sašo Džeroski et al.

Selecting the most suitable algorithm and determining its hyperparameters for a given optimization problem is a challenging task. Accurately predicting how well a certain algorithm could solve the problem is hence desirable. Recent studies in single-objective numerical optimization show that supervised machine learning methods can predict algorithm performance using landscape features extracted from the problem instances. Existing approaches typically treat the algorithms as black boxes, without consideration of their characteristics. To investigate whether a selection of landscape features that depends on algorithm properties could further improve regression accuracy, we consider the modular CMA-ES framework and estimate how much each landscape feature contributes to the best algorithm performance regression models. Exploratory data analysis performed on this data indicates that the set of most relevant features does not depend on the configuration of individual modules, but the influence that these features have on regression accuracy does. In addition, we show that by using classifiers that take into account the features' relevance to the model accuracy, we are able to predict the status of individual modules in the CMA-ES configurations.

LG, Mar 22, 2022
Explainable Landscape Analysis in Automated Algorithm Performance Prediction

Risto Trajanov, Stefan Dimeski, Martin Popovski et al.

Predicting the performance of an optimization algorithm on a new problem instance is crucial in order to select the most appropriate algorithm for solving that problem instance. For this purpose, recent studies learn a supervised machine learning (ML) model using a set of problem landscape features linked to the performance achieved by the optimization algorithm. However, these models are black boxes whose only goal is good predictive performance; they do not explain which landscape features contribute the most to the prediction of the performance achieved by the optimization algorithm. In this study, we investigate the expressiveness of problem landscape features utilized by different supervised ML models in automated algorithm performance prediction. The experimental results point out that the selection of the supervised ML method is crucial, since different supervised ML regression models utilize the problem landscape features differently and there is no common pattern with regard to which landscape features are the most informative.

LG, Jun 8, 2023
DynamoRep: Trajectory-Based Population Dynamics for Classification of Black-box Optimization Problems

Gjorgjina Cenikj, Gašper Petelin, Carola Doerr et al.

The application of machine learning (ML) models to the analysis of optimization algorithms requires the representation of optimization problems using numerical features. These features can be used as input for ML models that are trained to select or to configure a suitable algorithm for the problem at hand. Since in pure black-box optimization information about the problem instance can only be obtained through function evaluation, a common approach is to dedicate some function evaluations for feature extraction, e.g., using random sampling. This approach has two key downsides: (1) It reduces the budget left for the actual optimization phase, and (2) it neglects valuable information that could be obtained from a problem-solver interaction. In this paper, we propose a feature extraction method that describes the trajectories of optimization algorithms using simple descriptive statistics. We evaluate the generated features for the task of classifying problem classes from the Black Box Optimization Benchmarking (BBOB) suite. We demonstrate that the proposed DynamoRep features capture enough information to identify the problem class on which the optimization algorithm is running, achieving a mean classification accuracy of 95% across all experiments.
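The idea of describing an optimizer's trajectory with simple descriptive statistics can be sketched as follows. This is a minimal illustration, not the authors' implementation: the choice of statistics (min, max, mean, standard deviation), the per-generation layout, and the names `describe`, `trajectory_features`, and `toy_trajectory` are all assumptions, and the shrinking random sampler only stands in for a real optimization algorithm.

```python
import random
import statistics


def sphere(x):
    """Toy objective function standing in for a benchmark problem."""
    return sum(v * v for v in x)


def describe(values):
    """Assumed statistic set: min, max, mean, population standard deviation."""
    return [min(values), max(values), statistics.fmean(values), statistics.pstdev(values)]


def trajectory_features(populations, objective):
    """Concatenate per-generation statistics of each coordinate and of the objective values."""
    features = []
    for population in populations:
        dim = len(population[0])
        for d in range(dim):
            features.extend(describe([individual[d] for individual in population]))
        features.extend(describe([objective(individual) for individual in population]))
    return features


def toy_trajectory(dim=2, pop_size=10, generations=5, seed=0):
    """Hypothetical stand-in for an optimizer: uniform samples from a box that halves each generation."""
    rng = random.Random(seed)
    populations = []
    spread = 5.0
    for _ in range(generations):
        populations.append(
            [[rng.uniform(-spread, spread) for _ in range(dim)] for _ in range(pop_size)]
        )
        spread *= 0.5
    return populations


features = trajectory_features(toy_trajectory(), sphere)
# 5 generations * (2 coordinates + 1 objective) * 4 statistics = 60 features
print(len(features))
```

Because the features are computed from populations the optimizer evaluates anyway, no separate sampling budget is spent on feature extraction, which is the first downside the abstract attributes to sampling-based approaches.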

LG, May 31, 2023
Assessing the Generalizability of a Performance Predictive Model

Ana Nikolikj, Gjorgjina Cenikj, Gordana Ispirova et al.

A key component of automated algorithm selection and configuration, which in most cases are performed using supervised machine learning (ML) methods, is a good-performing predictive model. The predictive model uses the feature representation of a set of problem instances as input data and predicts the algorithm performance achieved on them. Common machine learning models struggle to make predictions for instances with feature representations not covered by the training data, resulting in poor generalization to unseen problems. In this study, we propose a workflow to estimate how well a predictive model for algorithm performance, trained on one benchmark suite, generalizes to another. The workflow has been tested by training predictive models across benchmark suites and the results show that generalizability patterns in the landscape feature space are reflected in the performance space.

LG, May 30, 2023
Sensitivity Analysis of RF+clust for Leave-one-problem-out Performance Prediction

Ana Nikolikj, Michal Pluháček, Carola Doerr et al.

Leave-one-problem-out (LOPO) performance prediction requires machine learning (ML) models to extrapolate algorithms' performance from a set of training problems to a previously unseen problem. LOPO is a very challenging task even for state-of-the-art approaches. Models that work well in the easier leave-one-instance-out scenario often fail to generalize well to the LOPO setting. To address the LOPO problem, recent work suggested enriching standard random forest (RF) performance regression models with a weighted average of algorithms' performance on training problems that are considered similar to a test problem. More precisely, in this RF+clust approach, the weights are chosen proportionally to the distances of the problems in some feature space. In this work, we extend the RF+clust approach by adjusting the distance-based weights with the importance of the features for performance regression. That is, instead of considering cosine distance in the feature space, we consider a weighted distance measure, with weights depending on the relevance of the feature for the regression model. Our empirical evaluation of the modified RF+clust approach on the CEC 2014 benchmark suite confirms its advantages over the naive distance measure. However, we also observe room for improvement, in particular with respect to more expressive feature portfolios.
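A rough sketch of the importance-weighted similarity described above is given below. It covers only the similarity-weighted average over training problems; how that average is combined with the random forest prediction is not stated in the abstract and is left out. The function names, the similarity threshold, and the use of similarity values directly as weights are assumptions made for illustration.

```python
import math


def weighted_cosine_similarity(a, b, w):
    """Cosine similarity in which each feature is scaled by a non-negative importance weight."""
    dot = sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
    norm_a = math.sqrt(sum(wi * ai * ai for wi, ai in zip(w, a)))
    norm_b = math.sqrt(sum(wi * bi * bi for wi, bi in zip(w, b)))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as dissimilar
    return dot / (norm_a * norm_b)


def similarity_weighted_performance(test_features, train_features, train_performance,
                                    importance, threshold=0.5):
    """Average the performance on training problems similar to the test problem.

    The threshold (an assumed hyperparameter) decides which training problems
    count as similar; each selected problem is weighted by its similarity.
    Returns None when no training problem is similar enough.
    """
    sims = [weighted_cosine_similarity(test_features, f, importance) for f in train_features]
    selected = [(s, p) for s, p in zip(sims, train_performance) if s >= threshold]
    if not selected:
        return None
    total = sum(s for s, _ in selected)
    return sum(s * p for s, p in selected) / total


# Toy usage with made-up numbers: two training problems, two features.
estimate = similarity_weighted_performance(
    test_features=[1.0, 0.0],
    train_features=[[1.0, 0.0], [0.0, 1.0]],
    train_performance=[2.0, 10.0],
    importance=[1.0, 1.0],
)
print(estimate)
```

Setting an importance weight to zero removes that feature from the comparison entirely, so two problems that differ only in features the regression model ignores are treated as identical.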

NE, Oct 22, 2021
Explainable Landscape-Aware Optimization Performance Prediction

Risto Trajanov, Stefan Dimeski, Martin Popovski et al.

Efficiently solving an unseen optimization problem depends on the appropriate selection of an optimization algorithm and its hyper-parameters. For this purpose, automated algorithm performance prediction should be performed, which in the most commonly applied practices involves training a supervised ML algorithm using a set of problem landscape features. However, the main issue with training such models is their limited explainability, since they only provide information about the joint impact of the set of landscape features on the end prediction results. In this study, we investigate explainable landscape-aware regression models where the contribution of each landscape feature to the prediction of the optimization algorithm performance is estimated on a global and local level. The global level provides information about the impact of the feature across all benchmark problems' instances, while the local level provides information about the impact on a specific problem instance. The experimental results are obtained using the COCO benchmark problems and three differently configured modular CMA-ESs. The results show a proof of concept that different sets of features are important for different problem instances, which indicates that further personalization of the landscape space is required when training an automated algorithm performance prediction model.

NE, Apr 27, 2021
A Complementarity Analysis of the COCO Benchmark Problems and Artificially Generated Problems

Urban Škvorc, Tome Eftimov, Peter Korošec

When designing a benchmark problem set, it is important to create a set of benchmark problems that are a good generalization of the set of all possible problems. One possible way of easing this difficult task is by using artificially generated problems. In this paper, one such single-objective continuous problem generation approach is analyzed and compared with the COCO benchmark problem set, a well-known problem set for benchmarking numerical optimization algorithms. Using Exploratory Landscape Analysis and Singular Value Decomposition, we show that such representations allow us to further explore the relations between the problems by applying visualization and correlation analysis techniques, with the goal of decreasing the bias in benchmark problem assessment.

NE, Apr 22, 2021
Personalizing Performance Regression Models to Black-Box Optimization Problems

Tome Eftimov, Anja Jankovic, Gorjan Popovski et al.

Accurately predicting the performance of different optimization algorithms for previously unseen problem instances is crucial for high-performing algorithm selection and configuration techniques. In the context of numerical optimization, supervised regression approaches built on top of exploratory landscape analysis are becoming very popular. From the point of view of Machine Learning (ML), however, the approaches are often rather naive, using default regression or classification techniques without proper investigation of the suitability of the ML tools. With this work, we bring to the attention of our community the possibility to personalize regression models to specific types of optimization problems. Instead of aiming for a single model that works well across a whole set of possibly diverse problems, our personalized regression approach acknowledges that different models may suit different types of problems. Going one step further, we also investigate the impact of selecting not a single regression model per problem, but personalized ensembles. We test our approach on predicting the performance of numerical optimization heuristics on the BBOB benchmark collection.