LGJul 30, 2022
HPO X ELA: Investigating Hyperparameter Optimization Landscapes by Means of Exploratory Landscape AnalysisLennart Schneider, Lennart Schäpermeier, Raphael Patrick Prager et al.
Hyperparameter optimization (HPO) is a key component of machine learning models for achieving peak predictive performance. While numerous methods and algorithms for HPO have been proposed over the last years, little progress has been made in illuminating and examining the actual structure of these black-box optimization problems. Exploratory landscape analysis (ELA) subsumes a set of techniques that can be used to gain knowledge about properties of unknown optimization problems. In this paper, we evaluate the performance of five different black-box optimizers on 30 HPO problems, which consist of two-, three- and five-dimensional continuous search spaces of the XGBoost learner trained on 10 different data sets. This is contrasted with the performance of the same optimizers evaluated on 360 problem instances from the black-box optimization benchmark (BBOB). We then compute ELA features on the HPO and BBOB problems and examine similarities and differences. A cluster analysis of the HPO and BBOB problems in ELA feature space allows us to identify how the HPO problems compare to the BBOB problems on a structural meta-level. We identify a subset of BBOB problems that are close to the HPO problems in ELA feature space and show that optimizer performance is comparably similar on these two sets of benchmark problems. We highlight open challenges of ELA for HPO and discuss potential directions of future research and applications.
LGApr 12, 2022
A Collection of Deep Learning-based Feature-Free Approaches for Characterizing Single-Objective Continuous Fitness LandscapesMoritz Vinzent Seiler, Raphael Patrick Prager, Pascal Kerschke et al.
Exploratory Landscape Analysis is a powerful technique for numerically characterizing landscapes of single-objective continuous optimization problems. Landscape insights are crucial both for problem understanding as well as for assessing benchmark set diversity and composition. Despite the irrefutable usefulness of these features, they suffer from their own ailments and downsides. Hence, in this work we provide a collection of different approaches to characterize optimization landscapes. Similar to conventional landscape features, we require a small initial sample. However, instead of computing features based on that sample, we develop alternative representations of the original sample. These range from point clouds to 2D images and, therefore, are entirely feature-free. We demonstrate and validate our devised methods on the BBOB testbed and predict, with the help of Deep Learning, the high-level, expert-based landscape properties such as the degree of multimodality and the existence of funnel structures. The quality of our approaches is on par with methods relying on the traditional landscape features. Thereby, we provide an exciting new perspective on every research area which utilizes problem information such as problem understanding and algorithm design as well as automated algorithm configuration and selection.
AIFeb 16, 2023
Tools for Landscape Analysis of Optimisation Problems in Procedural Content Generation for GamesVanessa Volz, Boris Naujoks, Pascal Kerschke et al.
The term Procedural Content Generation (PCG) refers to the (semi-)automatic generation of game content by algorithmic means, and its methods are becoming increasingly popular in game-oriented research and industry. A special class of these methods, which is commonly known as search-based PCG, treats the given task as an optimisation problem. Such problems are predominantly tackled by evolutionary algorithms. We will demonstrate in this paper that obtaining more information about the defined optimisation problem can substantially improve our understanding of how to approach the generation of content. To do so, we present and discuss three efficient analysis tools, namely diagonal walks, the estimation of high-level properties, as well as problem similarity measures. We discuss the purpose of each of the considered methods in the context of PCG and provide guidelines for the interpretation of the results received. This way we aim to provide methods for the comparison of PCG approaches and eventually, increase the quality and practicality of generated content in industry.
LGNov 26, 2022
Mixture of Decision Trees for Interpretable Machine LearningSimeon Brüggenjürgen, Nina Schaaf, Pascal Kerschke et al.
This work introduces a novel interpretable machine learning method called Mixture of Decision Trees (MoDT). It constitutes a special case of the Mixture of Experts ensemble architecture, which utilizes a linear model as gating function and decision trees as experts. Our proposed method is ideally suited for problems that cannot be satisfactorily learned by a single decision tree, but which can alternatively be divided into subproblems. Each subproblem can then be learned well from a single decision tree. Therefore, MoDT can be considered as a method that improves performance while maintaining interpretability by making each of its decisions understandable and traceable to humans. Our work is accompanied by a Python implementation, which uses an interpretable gating function, a fast learning algorithm, and a direct interface to fine-tuned interpretable visualization methods. The experiments confirm that the implementation works and, more importantly, show the superiority of our approach compared to single decision trees and random forests of similar complexity.
OCJul 1, 2024
R2 v2: The Pareto-compliant R2 Indicator for Better Benchmarking in Bi-objective OptimizationLennart Schäpermeier, Pascal Kerschke
In multi-objective optimization, set-based quality indicators are a cornerstone of benchmarking and performance assessment. They capture the quality of a set of trade-off solutions by reducing it to a scalar number. One of the most commonly used set-based metrics is the R2 indicator, which describes the expected utility of a solution set to a decision-maker under a distribution of utility functions. Typically, this indicator is applied by discretizing the latter distribution, yielding a weakly Pareto-compliant indicator. In consequence, adding a nondominated or dominating solution to a solution set may -- but does not have to -- improve the indicator's value. In this paper, we reinvestigate the R2 indicator under the premise that we have a continuous, uniform distribution of (Tchebycheff) utility functions. We analyze its properties in detail, demonstrating that this continuous variant is indeed Pareto-compliant -- that is, any beneficial solution will improve the metric's value. Additionally, we provide efficient computational procedures that (a) compute this metric for bi-objective problems in $\mathcal O (N \log N)$, and (b) can perform incremental updates to the indicator whenever solutions are added to (or removed from) the current set of solutions, without needing to recompute the indicator for the entire set. As a result, this work contributes to the state-of-the-art Pareto-compliant unary performance metrics, such as the hypervolume indicator, offering an efficient and promising alternative.
AIJul 4, 2024
Dancing to the State of the Art? How Candidate Lists Influence LKH for Solving the Traveling Salesperson ProblemJonathan Heins, Lennart Schäpermeier, Pascal Kerschke et al.
Solving the Traveling Salesperson Problem (TSP) remains a persistent challenge, despite its fundamental role in numerous generalized applications in modern contexts. Heuristic solvers address the demand for finding high-quality solutions efficiently. Among these solvers, the Lin-Kernighan-Helsgaun (LKH) heuristic stands out, as it complements the performance of genetic algorithms across a diverse range of problem instances. However, frequent timeouts on challenging instances hinder the practical applicability of the solver. Within this work, we investigate a previously overlooked factor contributing to many timeouts: The use of a fixed candidate set based on a tree structure. Our investigations reveal that candidate sets based on Hamiltonian circuits contain more optimal edges. We thus propose to integrate this promising initialization strategy, in the form of POPMUSIC, within an efficient restart version of LKH. As confirmed by our experimental studies, this refined TSP heuristic is much more efficient - causing fewer timeouts and improving the performance (in terms of penalized average runtime) by an order of magnitude - and thereby challenges the state of the art in TSP solving.
OCJan 23
BONO-Bench: A Comprehensive Test Suite for Bi-objective Numerical Optimization with Traceable Pareto SetsLennart Schäpermeier, Pascal Kerschke
The evaluation of heuristic optimizers on test problems, better known as \emph{benchmarking}, is a cornerstone of research in multi-objective optimization. However, most test problems used in benchmarking numerical multi-objective black-box optimizers come from one of two flawed approaches: On the one hand, problems are constructed manually, which result in problems with well-understood optimal solutions, but unrealistic properties and biases. On the other hand, more realistic and complex single-objective problems are composited into multi-objective problems, but with a lack of control and understanding of problem properties. This paper proposes an extensive problem generation approach for bi-objective numerical optimization problems consisting of the combination of theoretically well-understood convex-quadratic functions into unimodal and multimodal landscapes with and without global structure. It supports configuration of test problem properties, such as the number of decision variables, local optima, Pareto front shape, plateaus in the objective space, or degree of conditioning, while maintaining theoretical tractability: The optimal front can be approximated to an arbitrary degree of precision regarding Pareto-compliant performance indicators such as the hypervolume or the exact R2 indicator. To demonstrate the generator's capabilities, a test suite of 20 problem categories, called \emph{BONO-Bench}, is created and subsequently used as a basis of an illustrative benchmark study. Finally, the general approach underlying our proposed generator, together with the associated test suite, is publicly released in the Python package \texttt{bonobench} to facilitate reproducible benchmarking.
LGJan 2, 2024
Deep-ELA: Deep Exploratory Landscape Analysis with Self-Supervised Pretrained Transformers for Single- and Multi-Objective Continuous Optimization ProblemsMoritz Vinzent Seiler, Pascal Kerschke, Heike Trautmann
In many recent works, the potential of Exploratory Landscape Analysis (ELA) features to numerically characterize, in particular, single-objective continuous optimization problems has been demonstrated. These numerical features provide the input for all kinds of machine learning tasks on continuous optimization problems, ranging, i.a., from High-level Property Prediction to Automated Algorithm Selection and Automated Algorithm Configuration. Without ELA features, analyzing and understanding the characteristics of single-objective continuous optimization problems is -- to the best of our knowledge -- very limited. Yet, despite their usefulness, as demonstrated in several past works, ELA features suffer from several drawbacks. These include, in particular, (1.) a strong correlation between multiple features, as well as (2.) its very limited applicability to multi-objective continuous optimization problems. As a remedy, recent works proposed deep learning-based approaches as alternatives to ELA. In these works, e.g., point-cloud transformers were used to characterize an optimization problem's fitness landscape. However, these approaches require a large amount of labeled training data. Within this work, we propose a hybrid approach, Deep-ELA, which combines (the benefits of) deep learning and ELA features. Specifically, we pre-trained four transformers on millions of randomly generated optimization problems to learn deep representations of the landscapes of continuous single- and multi-objective optimization problems. Our proposed framework can either be used out-of-the-box for analyzing single- and multi-objective continuous optimization problems, or subsequently fine-tuned to various tasks focussing on algorithm behavior and problem understanding.
NEMay 1, 2025
To Repair or Not to Repair? Investigating the Importance of AB-Cycles for the State-of-the-Art TSP Heuristic EAXJonathan Heins, Darrell Whitley, Pascal Kerschke
The Edge Assembly Crossover (EAX) algorithm is the state-of-the-art heuristic for solving the Traveling Salesperson Problem (TSP). It regularly outperforms other methods, such as the Lin-Kernighan-Helsgaun heuristic (LKH), across diverse sets of TSP instances. Essentially, EAX employs a two-stage mechanism that focuses on improving the current solutions, first, at the local and, subsequently, at the global level. Although the second phase of the algorithm has been thoroughly studied, configured, and refined in the past, in particular, its first stage has hardly been examined. In this paper, we thus focus on the first stage of EAX and introduce a novel method that quickly verifies whether the AB-cycles, generated during its internal optimization procedure, yield valid tours -- or whether they need to be repaired. Knowledge of the latter is also particularly relevant before applying other powerful crossover operators such as the Generalized Partition Crossover (GPX). Based on our insights, we propose and evaluate several improved versions of EAX. According to our benchmark study across 10 000 different TSP instances, the most promising of our proposed EAX variants demonstrates improved computational efficiency and solution quality on previously rather difficult instances compared to the current state-of-the-art EAX algorithm.
NENov 29, 2020
To Boldly Show What No One Has Seen Before: A Dashboard for Visualizing Multi-objective LandscapesLennart Schäpermeier, Christian Grimme, Pascal Kerschke
Simultaneously visualizing the decision and objective space of continuous multi-objective optimization problems (MOPs) recently provided key contributions in understanding the structure of their landscapes. For the sake of advancing these recent findings, we compiled all state-of-the-art visualization methods in a single R-package (moPLOT). Moreover, we extended these techniques to handle three-dimensional decision spaces and propose two solutions for visualizing the resulting volume of data points. This enables - for the first time - to illustrate the landscape structures of three-dimensional MOPs. However, creating these visualizations using the aforementioned framework still lays behind a high barrier of entry for many people as it requires basic skills in R. To enable any user to create and explore MOP landscapes using moPLOT, we additionally provide a dashboard that allows to compute the state-of-the-art visualizations for a wide variety of common benchmark functions through an interactive (web-based) user interface.
NEOct 2, 2020
Multiobjectivization of Local Search: Single-Objective Optimization Benefits From Multi-Objective Gradient DescentVera Steinhoff, Pascal Kerschke, Pelin Aspar et al.
Multimodality is one of the biggest difficulties for optimization as local optima are often preventing algorithms from making progress. This does not only challenge local strategies that can get stuck. It also hinders meta-heuristics like evolutionary algorithms in convergence to the global optimum. In this paper we present a new concept of gradient descent, which is able to escape local traps. It relies on multiobjectivization of the original problem and applies the recently proposed and here slightly modified multi-objective local search mechanism MOGSA. We use a sophisticated visualization technique for multi-objective problems to prove the working principle of our idea. As such, this work highlights the transfer of new insights from the multi-objective to the single-objective domain and provides first visual evidence that multiobjectivization can link single-objective local optima in multimodal landscapes.
NEJul 7, 2020
Benchmarking in Optimization: Best Practice and Open IssuesThomas Bartz-Beielstein, Carola Doerr, Daan van den Berg et al.
This survey compiles ideas and recommendations from more than a dozen researchers with different backgrounds and from different institutes around the world. Promoting best practice in benchmarking is its main goal. The article discusses eight essential topics in benchmarking: clearly stated goals, well-specified problems, suitable algorithms, adequate performance measures, thoughtful analysis, effective and efficient designs, comprehensible presentations, and guaranteed reproducibility. The final goal is to provide well-accepted guidelines (rules) that might be useful for authors and reviewers. As benchmarking in optimization is an active and evolving field of research this manuscript is meant to co-evolve over time by means of periodic updates.
LGJun 29, 2020
Deep Learning as a Competitive Feature-Free Approach for Automated Algorithm Selection on the Traveling Salesperson ProblemMoritz Seiler, Janina Pohl, Jakob Bossek et al.
In this work we focus on the well-known Euclidean Traveling Salesperson Problem (TSP) and two highly competitive inexact heuristic TSP solvers, EAX and LKH, in the context of per-instance algorithm selection (AS). We evolve instances with 1,000 nodes where the solvers show strongly different performance profiles. These instances serve as a basis for an exploratory study on the identification of well-discriminating problem characteristics (features). Our results in a nutshell: we show that even though (1) promising features exist, (2) these are in line with previous results from the literature, and (3) models trained with these features are more accurate than models adopting sophisticated feature selection methods, the advantage is not close to the virtual best solver in terms of penalized average runtime and so is the performance gain over the single best solver. However, we show that a feature-free deep neural network based approach solely based on visual representation of the instances already matches classical AS model results and thus shows huge potential for future studies.
NEJun 25, 2020
Empirical Study on the Benefits of Multiobjectivization for Solving Single-Objective ProblemsVera Steinhoff, Pascal Kerschke, Christian Grimme
When dealing with continuous single-objective problems, multimodality poses one of the biggest difficulties for global optimization. Local optima are often preventing algorithms from making progress and thus pose a severe threat. In this paper we analyze how single-objective optimization can benefit from multiobjectivization by considering an additional objective. With the use of a sophisticated visualization technique based on the multi-objective gradients, the properties of the arising multi-objective landscapes are illustrated and examined. We will empirically show that the multi-objective optimizer MOGSA is able to exploit these properties to overcome local traps. The performance of MOGSA is assessed on a testbed of several functions provided by the COCO platform. The results are compared to the local optimizer Nelder-Mead.
NEJun 20, 2020
One PLOT to Show Them All: Visualization of Efficient Sets in Multi-Objective LandscapesLennart Schäpermeier, Christian Grimme, Pascal Kerschke
Visualization techniques for the decision space of continuous multi-objective optimization problems (MOPs) are rather scarce in research. For long, all techniques focused on global optimality and even for the few available landscape visualizations, e.g., cost landscapes, globality is the main criterion. In contrast, the recently proposed gradient field heatmaps (GFHs) emphasize the location and attraction basins of local efficient sets, but ignore the relation of sets in terms of solution quality. In this paper, we propose a new and hybrid visualization technique, which combines the advantages of both approaches in order to represent local and global optimality together within a single visualization. Therefore, we build on the GFH approach but apply a new technique for approximating the location of locally efficient points and using the divergence of the multi-objective gradient vector field as a robust second-order condition. Then, the relative dominance relationship of the determined locally efficient points is used to visualize the complete landscape of the MOP. Augmented by information on the basins of attraction, this Plot of Landscapes with Optimal Trade-offs (PLOT) becomes one of the most informative multi-objective landscape visualization techniques available.
LGMay 27, 2020
Enhancing Resilience of Deep Learning Networks by Means of Transferable AdversariesMoritz Seiler, Heike Trautmann, Pascal Kerschke
Artificial neural networks in general and deep learning networks in particular established themselves as popular and powerful machine learning algorithms. While the often tremendous sizes of these networks are beneficial when solving complex tasks, the tremendous number of parameters also causes such networks to be vulnerable to malicious behavior such as adversarial perturbations. These perturbations can change a model's classification decision. Moreover, while single-step adversaries can easily be transferred from network to network, the transfer of more powerful multi-step adversaries has - usually -- been rather difficult. In this work, we introduce a method for generating strong ad-versaries that can easily (and frequently) be transferred between different models. This method is then used to generate a large set of adversaries, based on which the effects of selected defense methods are experimentally assessed. At last, we introduce a novel, simple, yet effective approach to enhance the resilience of neural networks against adversaries and benchmark it against established defense methods. In contrast to the already existing methods, our proposed defense approach is much more efficient as it only requires a single additional forward-pass to achieve comparable performance results.
AIMay 27, 2020
Anytime Behavior of Inexact TSP Solvers and Perspectives for Automated Algorithm SelectionJakob Bossek, Pascal Kerschke, Heike Trautmann
The Traveling-Salesperson-Problem (TSP) is arguably one of the best-known NP-hard combinatorial optimization problems. The two sophisticated heuristic solvers LKH and EAX and respective (restart) variants manage to calculate close-to optimal or even optimal solutions, also for large instances with several thousand nodes in reasonable time. In this work we extend existing benchmarking studies by addressing anytime behaviour of inexact TSP solvers based on empirical runtime distributions leading to an increased understanding of solver behaviour and the respective relation to problem hardness. It turns out that performance ranking of solvers is highly dependent on the focused approximation quality. Insights on intersection points of performances offer huge potential for the construction of hybridized solvers depending on instance features. Moreover, instance features tailored to anytime performance and corresponding performance indicators will highly improve automated algorithm selection models by including comprehensive information on solver quality.
NEMar 30, 2020
Initial Design Strategies and their Effects on Sequential Model-Based OptimizationJakob Bossek, Carola Doerr, Pascal Kerschke
Sequential model-based optimization (SMBO) approaches are algorithms for solving problems that require computationally or otherwise expensive function evaluations. The key design principle of SMBO is a substitution of the true objective function by a surrogate, which is used to propose the point(s) to be evaluated next. SMBO algorithms are intrinsically modular, leaving the user with many important design choices. Significant research efforts go into understanding which settings perform best for which type of problems. Most works, however, focus on the choice of the model, the acquisition function, and the strategy used to optimize the latter. The choice of the initial sampling strategy, however, receives much less attention. Not surprisingly, quite diverging recommendations can be found in the literature. We analyze in this work how the size and the distribution of the initial sample influences the overall quality of the efficient global optimization~(EGO) algorithm, a well-known SMBO approach. While, overall, small initial budgets using Halton sampling seem preferable, we also observe that the performance landscape is rather unstructured. We furthermore identify several situations in which EGO performs unfavorably against random sampling. Both observations indicate that an adaptive SMBO design could be beneficial, making SMBO an interesting test-bed for automated algorithm design.
NEFeb 4, 2020
The Node Weight Dependent Traveling Salesperson Problem: Approximation Algorithms and Randomized Search HeuristicsJakob Bossek, Katrin Casel, Pascal Kerschke et al.
Several important optimization problems in the area of vehicle routing can be seen as a variant of the classical Traveling Salesperson Problem (TSP). In the area of evolutionary computation, the traveling thief problem (TTP) has gained increasing interest over the last 5 years. In this paper, we investigate the effect of weights on such problems, in the sense that the cost of traveling increases with respect to the weights of nodes already visited during a tour. This provides abstractions of important TSP variants such as the Traveling Thief Problem and time dependent TSP variants, and allows to study precisely the increase in difficulty caused by weight dependence. We provide a 3.59-approximation for this weight dependent version of TSP with metric distances and bounded positive weights. Furthermore, we conduct experimental investigations for simple randomized local search with classical mutation operators and two variants of the state-of-the-art evolutionary algorithm EAX adapted to the weighted TSP. Our results show the impact of the node weights on the position of the nodes in the resulting tour.
NEDec 19, 2019
One-Shot Decision-Making with and without SurrogatesJakob Bossek, Pascal Kerschke, Aneta Neumann et al.
One-shot decision making is required in situations in which we can evaluate a fixed number of solution candidates but do not have any possibility for further, adaptive sampling. Such settings are frequently encountered in neural network design, hyper-parameter optimization, and many simulation-based real-world optimization tasks, in which evaluations are costly and time sparse. It seems intuitive that well-distributed samples should be more meaningful in one-shot decision making settings than uniform or grid-based samples, since they show a better coverage of the decision space. In practice, quasi-random designs such as Latin Hypercube Samples and low-discrepancy point sets form indeed the state of the art, as confirmed by a number of recent studies and competitions. In this work we take a closer look into the correlation between the distribution of the quasi-random designs and their performance in one-shot decision making tasks, with the goal to investigate whether the assumed correlation between uniform distribution and performance can be confirmed. We study three different decision tasks: classic one-shot optimization (only the best sample matters), one-shot optimization with surrogates (allowing to use surrogate models for selecting a design that need not necessarily be one of the evaluated samples), and one-shot regression (i.e., function approximation, with minimization of mean squared error as objective). Our results confirm an advantage of low-discrepancy designs for all three settings. The overall correlation, however, is rather weak. We complement our study by evolving problem-specific samples that show significantly better performance for the regression task than the standard approaches based on low-discrepancy sequences, giving strong indication that significant performance gains over state-of-the-art one-shot sampling techniques are possible.
LGNov 28, 2018
Automated Algorithm Selection: Survey and PerspectivesPascal Kerschke, Holger H. Hoos, Frank Neumann et al.
It has long been observed that for practically any computational problem that has been intensely studied, different instances are best solved using different algorithms. This is particularly pronounced for computationally hard problems, where in most cases, no single algorithm defines the state of the art; instead, there is a set of algorithms with complementary strengths. This performance complementarity can be exploited in various ways, one of which is based on the idea of selecting, from a set of given algorithms, for each problem instance to be solved the one expected to perform best. The task of automatically selecting an algorithm from a given set is known as the per-instance algorithm selection problem and has been intensely studied over the past 15 years, leading to major improvements in the state of the art in solving a growing number of discrete combinatorial problems, including propositional satisfiability and AI planning. Per-instance algorithm selection also shows much promise for boosting performance in solving continuous and mixed discrete/continuous optimisation problems. This survey provides an overview of research in automated algorithm selection, ranging from early and seminal works to recent and promising application areas. Different from earlier work, it covers applications to discrete and continuous problems, and discusses algorithm selection in context with conceptually related approaches, such as algorithm configuration, scheduling or portfolio selection. Since informative and cheaply computable problem instance features provide the basis for effective per-instance algorithm selection systems, we also provide an overview of such features for discrete and continuous problems. Finally, we provide perspectives on future work in the area and discuss a number of open research challenges.
MLNov 24, 2017
Automated Algorithm Selection on Continuous Black-Box Problems By Combining Exploratory Landscape Analysis and Machine LearningPascal Kerschke, Heike Trautmann
In this paper, we build upon previous work on designing informative and efficient Exploratory Landscape Analysis features for characterizing problems' landscapes and show their effectiveness in automatically constructing algorithm selection models in continuous black-box optimization problems. Focussing on algorithm performance results of the COCO platform of several years, we construct a representative set of high-performing complementary solvers and present an algorithm selection model that - compared to the portfolio's single best solver - on average requires less than half of the resources for solving a given problem. Therefore, there is a huge gain in efficiency compared to classical ensemble methods combined with an increased insight into problem characteristics and algorithm properties by using informative features. Acting on the assumption that the function set of the Black-Box Optimization Benchmark is representative enough for practical applications the model allows for selecting the best suited optimization algorithm within the considered set for unseen problems prior to the optimization itself based on a small sample of function evaluations. Note that such a sample can even be reused for the initial population of an evolutionary (optimization) algorithm so that even the feature costs become negligible.
MLAug 17, 2017
Comprehensive Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems Using the R-Package flaccoPascal Kerschke
Choosing the best-performing optimizer(s) out of a portfolio of optimization algorithms is usually a difficult and complex task. It gets even worse, if the underlying functions are unknown, i.e., so-called Black-Box problems, and function evaluations are considered to be expensive. In the case of continuous single-objective optimization problems, Exploratory Landscape Analysis (ELA) - a sophisticated and effective approach for characterizing the landscapes of such problems by means of numerical values before actually performing the optimization task itself - is advantageous. Unfortunately, until now it has been quite complicated to compute multiple ELA features simultaneously, as the corresponding code has been - if at all - spread across multiple platforms or at least across several packages within these platforms. This article presents a broad summary of existing ELA approaches and introduces flacco, an R-package for feature-based landscape analysis of continuous and constrained optimization problems. Although its functions neither solve the optimization problem itself nor the related "Algorithm Selection Problem (ASP)", it offers easy access to an essential ingredient of the ASP by providing a wide collection of ELA features on a single platform - even within a single package. In addition, flacco provides multiple visualization techniques, which enhance the understanding of some of these numerical features, and thereby make certain landscape properties more comprehensible. On top of that, we will introduce the package's build-in, as well as web-hosted and hence platform-independent, graphical user interface (GUI), which facilitates the usage of the package - especially for people who are not familiar with R - making it a very convenient toolbox when working towards algorithm selection of continuous single-objective optimization problems.
MLJan 5, 2017
OpenML: An R Package to Connect to the Machine Learning Platform OpenMLGiuseppe Casalicchio, Jakob Bossek, Michel Lang et al.
OpenML is an online machine learning platform where researchers can easily share data, machine learning tasks and experiments as well as organize them online to work and collaborate more efficiently. In this paper, we present an R package to interface with the OpenML platform and illustrate its usage in combination with the machine learning R package mlr. We show how the OpenML package allows R users to easily search, download and upload data sets and machine learning tasks. Furthermore, we also show how to upload results of experiments, share them with others and download results from other users. Beyond ensuring reproducibility of results, the OpenML platform automates much of the drudge work, speeds up research, facilitates collaboration and increases the users' visibility online.
AIJun 8, 2015
ASlib: A Benchmark Library for Algorithm SelectionBernd Bischl, Pascal Kerschke, Lars Kotthoff et al.
The task of algorithm selection involves choosing an algorithm from a set of algorithms on a per-instance basis in order to exploit the varying performance of algorithms over a set of instances. The algorithm selection problem is attracting increasing attention from researchers and practitioners in AI. Years of fruitful applications in a number of domains have resulted in a large amount of data, but the community lacks a standard format or repository for this data. This situation makes it difficult to share and compare different approaches effectively, as is done in other, more established fields. It also unnecessarily hinders new researchers who want to work in this area. To address this problem, we introduce a standardized format for representing algorithm selection scenarios and a repository that contains a growing number of data sets from the literature. Our format has been designed to be able to express a wide variety of different scenarios. Demonstrating the breadth and power of our platform, we describe a set of example experiments that build and evaluate algorithm selection models through a common interface. The results display the potential of algorithm selection to achieve significant performance improvements across a broad range of problems and algorithms.
AIMar 26, 2015
Averaged Hausdorff Approximations of Pareto Fronts based on Multiobjective Estimation of Distribution AlgorithmsLuis Marti, Christian Grimme, Pascal Kerschke et al.
In the a posteriori approach of multiobjective optimization the Pareto front is approximated by a finite set of solutions in the objective space. The quality of the approximation can be measured by different indicators that take into account the approximation's closeness to the Pareto front and its distribution along the Pareto front. In particular, the averaged Hausdorff indicator prefers an almost uniform distribution. An observed drawback of multiobjective estimation of distribution algorithms (MEDAs) is that - as common for randomized metaheuristics - the final population usually is not uniformly distributed along the Pareto front. Therefore, we propose a postprocessing strategy which consists of applying the averaged Hausdorff indicator to the complete archive of generated solutions after optimization in order to select a uniformly distributed subset of nondominated solutions from the archive. In this paper, we put forward a strategy for extracting the above described subset. The effectiveness of the proposal is contrasted in a series of experiments that involve different MEDAs and filtering techniques.