Benjamin Doerr

NE
h-index11
95papers
3,060citations
Novelty51%
AI Score58

95 Papers

NEApr 28, 2022
A First Runtime Analysis of the NSGA-II on a Multimodal Problem

Benjamin Doerr, Zhongdi Qu

Very recently, the first mathematical runtime analyses of the multi-objective evolutionary optimizer NSGA-II have been conducted. We continue this line of research with a first runtime analysis of this algorithm on a benchmark problem consisting of two multimodal objectives. We prove that if the population size $N$ is at least four times the size of the Pareto front, then the NSGA-II with four different ways to select parents and bit-wise mutation optimizes the OneJumpZeroJump benchmark with jump size~$2 \le k \le n/4$ in time $O(N n^k)$. When using fast mutation, a recently proposed heavy-tailed mutation operator, this guarantee improves by a factor of $k^{Ω(k)}$. Overall, this work shows that the NSGA-II copes with the local optima of the OneJumpZeroJump problem at least as well as the global SEMO algorithm.

55.2NEJun 2
Speeding Up the NSGA-II via Dynamic Population Sizes

Benjamin Doerr, Martin S. Krejca, Simon Wietheger

Multi-objective evolutionary algorithms (MOEAs) are among the most widely and successfully applied optimizers for multi-objective problems. However, to store many optimal trade-offs (the Pareto optima) simultaneously, MOEAs are typically run with a large population of solution candidates. This slows down the algorithm and renders the choice of the population size a crucial design decision. In this work, we aim to overcome these difficulties by proposing the dynamic NSGA-II, a variant of the well-known NSGA-II that starts with a small initial population and doubles it after a user-specified number $τ$ of function evaluations, up to a maximum size of $N_{max}$. We prove that the dynamic NSGA-II with optimal parameters computes the Pareto front of the OneMinMax benchmark of size $n$ with high probability in $O(n \log^2 n)$ function evaluations, which is considerably faster than the $Θ(n^2 \log n)$ runtime of the static NSGA-II with optimal parameters. For the OneJumpZeroJump benchmark with gap size $k$, we show a runtime of $O(n^k \log^2 n)$, improving upon the known runtime of $Θ(n^{k+1})$. We also propose a variant that uses the initial population size for a longer period and achieves slightly better performance. Finally, we show that a simple concurrent-run strategy turns our dynamic NSGA-II variants into parameter-less algorithms that exceed the above runtimes only by a logarithmic factor and hence still outperform the static NSGA-II by a factor of $\tildeΩ(n)$.

NENov 15, 2022
A Mathematical Runtime Analysis of the Non-dominated Sorting Genetic Algorithm III (NSGA-III)

Simon Wietheger, Benjamin Doerr

The Non-dominated Sorting Genetic Algorithm II (NSGA-II) is the most prominent multi-objective evolutionary algorithm for real-world applications. While it performs evidently well on bi-objective optimization problems, empirical studies suggest that it is less effective when applied to problems with more than two objectives. A recent mathematical runtime analysis confirmed this observation by proving the NGSA-II for an exponential number of iterations misses a constant factor of the Pareto front of the simple 3-objective OneMinMax problem. In this work, we provide the first mathematical runtime analysis of the NSGA-III, a refinement of the NSGA-II aimed at better handling more than two objectives. We prove that the NSGA-III with sufficiently many reference points -- a small constant factor more than the size of the Pareto front, as suggested for this algorithm -- computes the complete Pareto front of the 3-objective OneMinMax benchmark in an expected number of O(n log n) iterations. This result holds for all population sizes (that are at least the size of the Pareto front). It shows a drastic advantage of the NSGA-III over the NSGA-II on this benchmark. The mathematical arguments used here and in previous work on the NSGA-II suggest that similar findings are likely for other benchmarks with three or more objectives.

NEAug 18, 2022
Runtime Analysis for the NSGA-II: Provable Speed-Ups From Crossover

Benjamin Doerr, Zhongdi Qu

Very recently, the first mathematical runtime analyses for the NSGA-II, the most common multi-objective evolutionary algorithm, have been conducted. Continuing this research direction, we prove that the NSGA-II optimizes the OneJumpZeroJump benchmark asymptotically faster when crossover is employed. Together with a parallel independent work by Dang, Opris, Salehi, and Sudholt, this is the first time such an advantage of crossover is proven for the NSGA-II. Our arguments can be transferred to single-objective optimization. They then prove that crossover can speed up the $(μ+1)$ genetic algorithm in a different way and more pronounced than known before. Our experiments confirm the added value of crossover and show that the observed advantages are even larger than what our proofs can guarantee.

NENov 23, 2022
Runtime Analysis for the NSGA-II: Proving, Quantifying, and Explaining the Inefficiency For Many Objectives

Weijie Zheng, Benjamin Doerr

The NSGA-II is one of the most prominent algorithms to solve multi-objective optimization problems. Despite numerous successful applications, several studies have shown that the NSGA-II is less effective for larger numbers of objectives. In this work, we use mathematical runtime analyses to rigorously demonstrate and quantify this phenomenon. We show that even on the simple $m$-objective generalization of the discrete OneMinMax benchmark, where every solution is Pareto optimal, the NSGA-II also with large population sizes cannot compute the full Pareto front (objective vectors of all Pareto optima) in sub-exponential time when the number of objectives is at least three. The reason for this unexpected behavior lies in the fact that in the computation of the crowding distance, the different objectives are regarded independently. This is not a problem for two objectives, where any sorting of a pair-wise incomparable set of solutions according to one objective is also such a sorting according to the other objective (in the inverse order).

NESep 28, 2022
From Understanding the Population Dynamics of the NSGA-II to the First Proven Lower Bounds

Benjamin Doerr, Zhongdi Qu

Due to the more complicated population dynamics of the NSGA-II, none of the existing runtime guarantees for this algorithm is accompanied by a non-trivial lower bound. Via a first mathematical understanding of the population dynamics of the NSGA-II, that is, by estimating the expected number of individuals having a certain objective value, we prove that the NSGA-II with suitable population size needs $Ω(Nn\log n)$ function evaluations to find the Pareto front of the OneMinMax problem and $Ω(Nn^k)$ evaluations on the OneJumpZeroJump problem with jump size $k$. These bounds are asymptotically tight (that is, they match previously shown upper bounds) and show that the NSGA-II here does not even in terms of the parallel runtime (number of iterations) profit from larger population sizes. For the OneJumpZeroJump problem and when the same sorting is used for the computation of the crowding distance contributions of the two objectives, we even obtain a runtime estimate that is tight including the leading constant.

NAOct 5, 2013
A Lower Bound for the Discrepancy of a Random Point Set

Benjamin Doerr

We show that there is a constant $K > 0$ such that for all $N, s \in \N$, $s \le N$, the point set consisting of $N$ points chosen uniformly at random in the $s$-dimensional unit cube $[0,1]^s$ with probability at least $1-\exp(-Θ(s))$ admits an axis parallel rectangle $[0,x] \subseteq [0,1]^s$ containing $K \sqrt{sN}$ points more than expected. Consequently, the expected star discrepancy of a random point set is of order $\sqrt{s/N}$.

NEMar 5, 2022
Approximation Guarantees for the Non-Dominated Sorting Genetic Algorithm II (NSGA-II)

Weijie Zheng, Benjamin Doerr

Recent theoretical works have shown that the NSGA-II efficiently computes the full Pareto front when the population size is large enough. In this work, we study how well it approximates the Pareto front when the population size is smaller. For the OneMinMax benchmark, we point out situations in which the parents and offspring cover well the Pareto front, but the next population has large gaps on the Pareto front. Our mathematical proofs suggest as reason for this undesirable behavior that the NSGA-II in the selection stage computes the crowding distance once and then removes individuals with smallest crowding distance without considering that a removal increases the crowding distance of some individuals. We then analyze two variants not prone to this problem. For the NSGA-II that updates the crowding distance after each removal (Kukkonen and Deb (2006)) and the steady-state NSGA-II (Nebro and Durillo (2009)), we prove that the gaps in the Pareto front are never more than a small constant factor larger than the theoretical minimum. This is the first mathematical work on the approximation ability of the NSGA-II and the first runtime analysis for the steady-state NSGA-II. Experiments also show the superior approximation ability of the two NSGA-II variants.

NEOct 7, 2022
The $(1+(λ,λ))$ Global SEMO Algorithm

Benjamin Doerr, Omar El Hadri, Adrien Pinard

The $(1+(λ,λ))$ genetic algorithm is a recently proposed single-objective evolutionary algorithm with several interesting properties. We show that its main working principle, mutation with a high rate and crossover as repair mechanism, can be transported also to multi-objective evolutionary computation. We define the $(1+(λ,λ))$ global SEMO algorithm, a variant of the classic global SEMO algorithm, and prove that it optimizes the OneMinMax benchmark asymptotically faster than the global SEMO. Following the single-objective example, we design a one-fifth rule inspired dynamic parameter setting (to the best of our knowledge for the first time in discrete multi-objective optimization) and prove that it further improves the runtime to $O(n^2)$, whereas the best runtime guarantee for the global SEMO is only $O(n^2 \log n)$.

NEApr 15, 2022
Towards a Stronger Theory for Permutation-based Evolutionary Algorithms

Benjamin Doerr, Yassine Ghannane, Marouane Ibn Brahim

While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the \textsc{LeadingOnes} and \textsc{Jump} benchmarks. The latter shows that, different from bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $σ$ into another one $τ$, but also the precise cycle structure of $στ^{-1}$. For this reason, we also regard the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $Θ(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{Θ(m)}$ on jump functions with jump size~$m$.%

NEApr 20, 2023
How the Move Acceptance Hyper-Heuristic Copes With Local Optima: Drastic Differences Between Jumps and Cliffs

Benjamin Doerr, Arthur Dremaux, Johannes Lutzeyer et al.

In recent work, Lissovoi, Oliveto, and Warwicker (Artificial Intelligence (2023)) proved that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal cliff benchmark with remarkable efficiency. With its $O(n^3)$ runtime, for almost all cliff widths $d,$ the MAHH massively outperforms the $Θ(n^d)$ runtime of simple elitist evolutionary algorithms (EAs). For the most prominent multimodal benchmark, the jump functions, the given runtime estimates of $O(n^{2m} m^{-Θ(m)})$ and $Ω(2^{Ω(m)})$, for gap size $m \ge 2$, are far apart and the real performance of MAHH is still an open question. In this work, we resolve this question. We prove that for any choice of the MAHH selection parameter~$p$, the expected runtime of the MAHH on a jump function with gap size $m = o(n^{1/2})$ is at least $Ω(n^{2m-1} / (2m-1)!)$. This renders the MAHH much slower than simple elitist evolutionary algorithms with their typical $O(n^m)$ runtime. We also show that the MAHH with the global bit-wise mutation operator instead of the local one-bit operator optimizes jump functions in time $O(\min\{m n^m,\frac{n^{2m-1}}{m!Ω(m)^{m-2}}\})$, essentially the minimum of the optimization times of the $(1+1)$ EA and the MAHH. This suggests that combining several ways to cope with local optima can be a fruitful approach.

NEJul 5, 2022
Runtime Analysis for Permutation-based Evolutionary Algorithms

Benjamin Doerr, Yassine Ghannane, Marouane Ibn Brahim

While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the LeadingOnes and Jump benchmarks. The latter shows that, different from bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $σ$ into another one $τ$, but also the precise cycle structure of $στ^{-1}$. For this reason, we also regard the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $Θ(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{Θ(m)}$ on jump functions with jump size $m$. A short empirical analysis confirms these findings, but also reveals that small implementation details like the rate of void mutations can make an important difference.

NEApr 21, 2023
How Well Does the Metropolis Algorithm Cope With Local Optima?

Benjamin Doerr, Taha El Ghazi El Houssaini, Amirhossein Rajabi et al.

The Metropolis algorithm (MA) is a classic stochastic local search heuristic. It avoids getting stuck in local optima by occasionally accepting inferior solutions. To better and in a rigorous manner understand this ability, we conduct a mathematical runtime analysis of the MA on the CLIFF benchmark. Apart from one local optimum, cliff functions are monotonically increasing towards the global optimum. Consequently, to optimize a cliff function, the MA only once needs to accept an inferior solution. Despite seemingly being an ideal benchmark for the MA to profit from its main working principle, our mathematical runtime analysis shows that this hope does not come true. Even with the optimal temperature (the only parameter of the MA), the MA optimizes most cliff functions less efficiently than simple elitist evolutionary algorithms (EAs), which can only leave the local optimum by generating a superior solution possibly far away. This result suggests that our understanding of why the MA is often very successful in practice is not yet complete. Our work also suggests to equip the MA with global mutation operators, an idea supported by our preliminary experiments.

NEFeb 16, 2023
Fourier Analysis Meets Runtime Analysis: Precise Runtimes on Plateaus

Benjamin Doerr, Andrew James Kelley

We propose a new method based on discrete Fourier analysis to analyze the time evolutionary algorithms spend on plateaus. This immediately gives a concise proof of the classic estimate of the expected runtime of the $(1+1)$ evolutionary algorithm on the Needle problem due to Garnier, Kallel, and Schoenauer (1999). We also use this method to analyze the runtime of the $(1+1)$ evolutionary algorithm on a new benchmark consisting of $n/\ell$ plateaus of effective size $2^\ell-1$ which have to be optimized sequentially in a LeadingOnes fashion. Using our new method, we determine the precise expected runtime both for static and fitness-dependent mutation rates. We also determine the asymptotically optimal static and fitness-dependent mutation rates. For $\ell = o(n)$, the optimal static mutation rate is approximately $1.59/n$. The optimal fitness dependent mutation rate, when the first $k$ fitness-relevant bits have been found, is asymptotically $1/(k+1)$. These results, so far only proven for the single-instance problem LeadingOnes, thus hold for a much broader class of problems. We expect similar extensions to be true for other important results on LeadingOnes. We are also optimistic that our Fourier analysis approach can be applied to other plateau problems as well.

NEMay 7, 2022
Automated Algorithm Selection for Radar Network Configuration

Quentin Renau, Johann Dreo, Alain Peres et al.

The configuration of radar networks is a complex problem that is often performed manually by experts with the help of a simulator. Different numbers and types of radars as well as different locations that the radars shall cover give rise to different instances of the radar configuration problem. The exact modeling of these instances is complex, as the quality of the configurations depends on a large number of parameters, on internal radar processing, and on the terrains on which the radars need to be placed. Classic optimization algorithms can therefore not be applied to this problem, and we rely on "trial-and-error" black-box approaches. In this paper, we study the performances of 13 black-box optimization algorithms on 153 radar network configuration problem instances. The algorithms perform considerably better than human experts. Their ranking, however, depends on the budget of configurations that can be evaluated and on the elevation profile of the location. We therefore also investigate automated algorithm selection approaches. Our results demonstrate that a pipeline that extracts instance features from the elevation of the terrain performs on par with the classical, far more expensive approach that extracts features from the objective function.

NAMar 25, 2019
The recovery of ridge functions on the hypercube suffers from the curse of dimensionality

Benjamin Doerr, Sebastian Mayer

A multivariate ridge function is a function of the form $f(x) = g(a^{\scriptscriptstyle T} x)$, where $g$ is univariate and $a \in \mathbb{R}^d$. We show that the recovery of an unknown ridge function defined on the hypercube $[-1,1]^d$ with Lipschitz-regular profile $g$ suffers from the curse of dimensionality when the recovery error is measured in the $L_\infty$-norm, even if we allow randomized algorithms. If a limited number of components of $a$ is substantially larger than the others, then the curse of dimensionality is not present and the problem is weakly tractable provided the profile $g$ is sufficiently regular.

NEJun 22, 2022
General Univariate Estimation-of-Distribution Algorithms

Benjamin Doerr, Marc Dufay

We propose a general formulation of a univariate estimation-of-distribution algorithm (EDA). It naturally incorporates the three classic univariate EDAs \emph{compact genetic algorithm}, \emph{univariate marginal distribution algorithm} and \emph{population-based incremental learning} as well as the \emph{max-min ant system} with iteration-best update. Our unified description of the existing algorithms allows a unified analysis of these; we demonstrate this by providing an analysis of genetic drift that immediately gives the existing results proven separately for the four algorithms named above. Our general model also includes EDAs that are more efficient than the existing ones and these may not be difficult to find as we demonstrate for the OneMax and LeadingOnes benchmarks.

NEJun 18, 2022
From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms

Weijie Zheng, Benjamin Doerr

Estimation-of-distribution algorithms (EDAs) are optimization algorithms that learn a distribution on the search space from which good solutions can be sampled easily. A key parameter of most EDAs is the sample size (population size). If the population size is too small, the update of the probabilistic model builds on few samples, leading to the undesired effect of genetic drift. Too large population sizes avoid genetic drift, but slow down the process. Building on a recent quantitative analysis of how the population size leads to genetic drift, we design a smart-restart mechanism for EDAs. By stopping runs when the risk for genetic drift is high, it automatically runs the EDA in good parameter regimes. Via a mathematical runtime analysis, we prove a general performance guarantee for this smart-restart scheme. This in particular shows that in many situations where the optimal (problem-specific) parameter values are known, the restart scheme automatically finds these, leading to the asymptotically optimal performance. We also conduct an extensive experimental analysis. On four classic benchmark problems, we clearly observe the critical influence of the population size on the performance, and we find that the smart-restart scheme leads to a performance close to the one obtainable with optimal parameter values. Our results also show that previous theory-based suggestions for the optimal population size can be far from the optimal ones, leading to a performance clearly inferior to the one obtained via the smart-restart scheme. We also conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial optimization problems from the literature, the max-cut problem and the bipartition problem. Again, we observe that the smart-restart mechanism finds much better values for the population size than those suggested in the literature, leading to a much better performance.

41.0NEMay 28
Selection Hyper-heuristics Can Automatically Adjust the Learning Period to Optimally Solve Pseudo-Boolean Problems

Benjamin Doerr, Pietro S. Oliveto, John Alasdair Warwicker

The Random Gradient hyper-heuristic was recently shown to be able to learn the optimal neighbourhood size when optimizing the LeadingOnes benchmark via the Randomised Local Search (RLS) meta-heuristic. However, for this to happen, a learning period of a certain length $τ$ had to be used, differently from classic hyper-heuristics, which change their behaviour based on the success of only the previous iteration. In this paper, we show how to automatically set this new parameter value, relieving the user from the non-trivial task of controlling this novel algorithm parameter. We prove that the resulting hyper-heuristic selects the optimal neighbourhood size in a $1-o(1)$ fraction of the iterations and, consequently, optimises the LeadingOnes benchmark in the best possible time (apart from lower-order terms) achievable with these neighborhood sizes.

NEMar 13, 2023
(1+1) Genetic Programming With Functionally Complete Instruction Sets Can Evolve Boolean Conjunctions and Disjunctions with Arbitrarily Small Error

Benjamin Doerr, Andrei Lissovoi, Pietro S. Oliveto

Recently it has been proven that simple GP systems can efficiently evolve a conjunction of $n$ variables if they are equipped with the minimal required components. In this paper, we make a considerable step forward by analysing the behaviour and performance of a GP system for evolving a Boolean conjunction or disjunction of $n$ variables using a complete function set that allows the expression of any Boolean function of up to $n$ variables. First we rigorously prove that a GP system using the complete truth table to evaluate the program quality, and equipped with both the AND and OR operators and positive literals, evolves the exact target function in $O(\ell n \log^2 n)$ iterations in expectation, where $\ell \geq n$ is a limit on the size of any accepted tree. Additionally, we show that when a polynomial sample of possible inputs is used to evaluate the solution quality, conjunctions or disjunctions with any polynomially small generalisation error can be evolved with probability $1 - O(\log^2(n)/n)$. The latter result also holds if GP uses AND, OR and positive and negated literals, thus has the power to express any Boolean function of $n$ distinct variables. To prove our results we introduce a super-multiplicative drift theorem that gives significantly stronger runtime bounds when the expected progress is only slightly super-linear in the distance from the optimum.

42.9NEMay 27
A Fresh Look at Lamarckian Evolution and the Baldwin Effect

Inès Benito, Johannes F. Lutzeyer, Benjamin Doerr

Baldwinian and Lamarckian evolution have existed for a long time in evolutionary algorithms (EAs) without ever dominating the academic literature or practical applications. In this work, we use modern empirical and theoretical methods to revisit Lamarckian and Baldwinian evolution and rigorously compare them with the generic Darwinian evolution. On the empirical side, we run a comprehensive suite of experiments on graphs from six different datasets from the recent GraphBench benchmark on Maximum Independent Set and Maximum Cut problems. Our results show that Baldwinian and Lamarckian evolution consistently outperform Darwinian evolution, confirming the great potential of local search augmented evolutionary algorithms. Notably, in the great majority of cases, all EAs outperform recent deep learning baselines and approach the performance of highly specialised heuristic and exact solvers. We furthermore report a high-performing set of generalist parameters for all studied evolution types that we hope will be of use to practitioners in future. On the theoretical side, we extend the existing Deceptive Leading Block benchmark to arbitrary block length and use tools from modern theoretical runtime analysis to prove upper and lower bounds on the expected runtime. For block lengths greater than two, Baldwinian evolution is asymptotically faster than Lamarckian which is asymptotically faster than Darwinian evolution. When accounting for the cost of the local search procedure in fitness evaluations, the ordering depends on the implementation with Baldwinian evolution staying fastest from small block lengths onwards, explaining its strong empirical performance.

NEJul 25, 2024
A Crowding Distance That Provably Solves the Difficulties of the NSGA-II in Many-Objective Optimization

Weijie Zheng, Yao Gao, Benjamin Doerr

Recent theoretical works have shown that the NSGA-II can have enormous difficulties to solve problems with more than two objectives. In contrast, algorithms like the NSGA-III or SMS-EMOA, differing from the NSGA-II only in the secondary selection criterion, provably perform well in these situations. To remedy this shortcoming of the NSGA-II, but at the same time keep the advantages of the widely accepted crowding distance, we use the insights of these previous work to define a variant of the crowding distance, called truthful crowding distance. Different from the classic crowding distance, it has for any number of objectives the desirable property that a small crowding distance value indicates that some other solution has a similar objective vector. Building on this property, we conduct mathematical runtime analyses for the NSGA-II with truthful crowding distance. We show that this algorithm can solve the many-objective versions of the OneMinMax, COCZ, LOTZ, and OJZJ$_k$ problems in the same (polynomial) asymptotic runtimes as the NSGA-III and the SMS-EMOA. This contrasts the exponential lower bounds previously shown for the classic NSGA-II. For the bi-objective versions of these problems, our NSGA-II has a similar performance as the classic NSGA-II, gaining however from smaller admissible population sizes. For the bi-objective OneMinMax problem, we also observe a (minimally) better performance in approximating the Pareto front. These results suggest that our truthful version of the NSGA-II has the same good performance as the classic NSGA-II in two objectives, but can resolve the drastic problems in more than two objectives.

48.6NEMay 14
First Mathematical Runtime Analyses of Multi-Objective Evolutionary Algorithms for Multi-Valued Decision Variables

Mingfeng Li, Zheng Cheng, Weijie Zheng et al.

Problems defined on binary decision spaces have been intensively studied in the theory of multi-objective evolutionary algorithms (MOEAs). In contrast, no mathematical runtime analyses exist so far for MOEAs dealing with decision variables that take a finite number $r > 2$ of values, despite the prevalence of such problems in practice. In this work, we begin to fill this research gap. We analyze how the classic SEMO algorithm with unit-strength local mutation computes the Pareto front of an $r$-valued counterpart of the classic \oneminmax benchmark. For the expected number of function evaluations until the Pareto front is covered by the population of this MOEA, we prove an upper bound of $O(n^2 r^2 \log n)$ and a near-tight lower bound of $Ω(n^2 r (r + \log n))$. We can close the small remaining gap between these two bounds by considering a variant of the algorithm that accepts only strictly better solutions; for this variant, we show an upper bound of $O(n^2 r (r + \log n))$, matching our lower bound (which also holds for this variant). Our results suggest that classic MOEAs encounter no significant additional difficulties when dealing with multi-valued decision variables. However, significantly more advanced tools may be required to obtain tight bounds for algorithms with more complex population dynamics.

NEJul 19, 2024
Hyper-Heuristics Can Profit From Global Variation Operators

Benjamin Doerr, Johannes F. Lutzeyer

In recent work, Lissovoi, Oliveto, and Warwicker (Artificial Intelligence (2023)) proved that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal CLIFF benchmark with remarkable efficiency. The $O(n^3)$ runtime of the MAHH, for almost all cliff widths $d\ge 2,$ is significantly better than the $Θ(n^d)$ runtime of simple elitist evolutionary algorithms (EAs) on CLIFF. In this work, we first show that this advantage is specific to the CLIFF problem and does not extend to the JUMP benchmark, the most prominent multi-modal benchmark in the theory of randomized search heuristics. We prove that for any choice of the MAHH selection parameter $p$, the expected runtime of the MAHH on a JUMP function with gap size $m = O(n^{1/2})$ is at least $Ω(n^{2m-1} / (2m-1)!)$. This is significantly slower than the $O(n^m)$ runtime of simple elitist EAs. Encouragingly, we also show that replacing the local one-bit mutation operator in the MAHH with the global bit-wise mutation operator, commonly used in EAs, yields a runtime of $\min\{1, O(\frac{e\ln(n)}{m})^m\} \, O(n^m)$ on JUMP functions. This is at least as good as the runtime of simple elitist EAs. For larger values of $m$, this result proves an asymptotic performance gain over simple EAs. As our proofs reveal, the MAHH profits from its ability to walk through the valley of lower objective values in moderate-size steps, always accepting inferior solutions. This is the first time that such an optimization behavior is proven via mathematical means. Generally, our result shows that combining two ways of coping with local optima, global mutation and accepting inferior solutions, can lead to considerable performance gains.

LGJun 3, 2025Code
HIEGNet: A Heterogenous Graph Neural Network Including the Immune Environment in Glomeruli Classification

Niklas Kormann, Masoud Ramuz, Zeeshan Nisar et al.

Graph Neural Networks (GNNs) have recently been found to excel in histopathology. However, an important histopathological task, where GNNs have not been extensively explored, is the classification of glomeruli health as an important indicator in nephropathology. This task presents unique difficulties, particularly for the graph construction, i.e., the identification of nodes, edges, and informative features. In this work, we propose a pipeline composed of different traditional and machine learning-based computer vision techniques to identify nodes, edges, and their corresponding features to form a heterogeneous graph. We then proceed to propose a novel heterogeneous GNN architecture for glomeruli classification, called HIEGNet, that integrates both glomeruli and their surrounding immune cells. Hence, HIEGNet is able to consider the immune environment of each glomerulus in its classification. Our HIEGNet was trained and tested on a dataset of Whole Slide Images from kidney transplant patients. Experimental results demonstrate that HIEGNet outperforms several baseline models and generalises best between patients among all baseline models. Our implementation is publicly available at https://github.com/nklsKrmnn/HIEGNet.git.

MLJan 12
Position: Don't be Afraid of Over-Smoothing And Over-Squashing

Niklas Kormann, Benjamin Doerr, Johannes F. Lutzeyer

Over-smoothing and over-squashing have been extensively studied in the literature on Graph Neural Networks (GNNs) over the past years. We challenge this prevailing focus in GNN research, arguing that these phenomena are less critical for practical applications than assumed. We suggest that performance decreases often stem from uninformative receptive fields rather than over-smoothing. We support this position with extensive experiments on several standard benchmark datasets, demonstrating that accuracy and over-smoothing are mostly uncorrelated and that optimal model depths remain small even with mitigation techniques, thus highlighting the negligible role of over-smoothing. Similarly, we challenge that over-squashing is always detrimental in practical applications. Instead, we posit that the distribution of relevant information over the graph frequently factorises and is often localised within a small k-hop neighbourhood, questioning the necessity of jointly observing entire receptive fields or engaging in an extensive search for long-range interactions. The results of our experiments show that architectural interventions designed to mitigate over-squashing fail to yield significant performance gains. This position paper advocates for a paradigm shift in theoretical research, urging a diligent analysis of learning tasks and datasets using statistics that measure the underlying distribution of label-relevant information to better understand their localisation and factorisation.

NEMay 3, 2025
Scalable Speed-ups for the SMS-EMOA from a Simple Aging Strategy

Mingfeng Li, Weijie Zheng, Benjamin Doerr

Different from single-objective evolutionary algorithms, where non-elitism is an established concept, multi-objective evolutionary algorithms almost always select the next population in a greedy fashion. In the only notable exception, Bian, Zhou, Li, and Qian (IJCAI 2023) proposed a stochastic selection mechanism for the SMS-EMOA and proved that it can speed up computing the Pareto front of the bi-objective jump benchmark with problem size $n$ and gap parameter $k$ by a factor of $\max\{1,2^{k/4}/n\}$. While this constitutes the first proven speed-up from non-elitist selection, suggesting a very interesting research direction, it has to be noted that a true speed-up only occurs for $k \ge 4\log_2(n)$, where the runtime is super-polynomial, and that the advantage reduces for larger numbers of objectives as shown in a later work. In this work, we propose a different non-elitist selection mechanism based on aging, which exempts individuals younger than a certain age from a possible removal. This remedies the two shortcomings of stochastic selection: We prove a speed-up by a factor of $\max\{1,Θ(k)^{k-1}\}$, regardless of the number of objectives. In particular, a positive speed-up can already be observed for constant $k$, the only setting for which polynomial runtimes can be witnessed. Overall, this result supports the use of non-elitist selection schemes, but suggests that aging-based mechanisms can be considerably more powerful than stochastic selection mechanisms.

NEApr 2, 2024
Already Moderate Population Sizes Provably Yield Strong Robustness to Noise

Denis Antipov, Benjamin Doerr, Alexandra Ivanova

Experience shows that typical evolutionary algorithms can cope well with stochastic disturbances such as noisy function evaluations. In this first mathematical runtime analysis of the $(1+λ)$ and $(1,λ)$ evolutionary algorithms in the presence of prior bit-wise noise, we show that both algorithms can tolerate constant noise probabilities without increasing the asymptotic runtime on the OneMax benchmark. For this, a population size $λ$ suffices that is at least logarithmic in the problem size $n$. The only previous result in this direction regarded the less realistic one-bit noise model, required a population size super-linear in the problem size, and proved a runtime guarantee roughly cubic in the noiseless runtime for the OneMax benchmark. Our significantly stronger results are based on the novel proof argument that the noiseless offspring can be seen as a biased uniform crossover between the parent and the noisy offspring. We are optimistic that the technical lemmas resulting from this insight will find applications also in future mathematical runtime analyses of evolutionary algorithms.

NEMar 13, 2024
The Runtime of Random Local Search on the Generalized Needle Problem

Benjamin Doerr, Andrew James Kelley

In their recent work, C. Doerr and Krejca (Transactions on Evolutionary Computation, 2023) proved upper bounds on the expected runtime of the randomized local search heuristic on generalized Needle functions. Based on these upper bounds, they deduce in a not fully rigorous manner a drastic influence of the needle radius $k$ on the runtime. In this short article, we add the missing lower bound necessary to determine the influence of parameter $k$ on the runtime. To this aim, we derive an exact description of the expected runtime, which also significantly improves the upper bound given by C. Doerr and Krejca. We also describe asymptotic estimates of the expected runtime.

NEJun 1, 2025
Speeding Up Hyper-Heuristics With Markov-Chain Operator Selection and the Only-Worsening Acceptance Operator

Abderrahim Bendahi, Benjamin Doerr, Adrien Fradin et al.

The move-acceptance hyper-heuristic was recently shown to be able to leave local optima with astonishing efficiency (Lissovoi et al., Artificial Intelligence (2023)). In this work, we propose two modifications to this algorithm that demonstrate impressive performances on a large class of benchmarks including the classic Cliff$_d$ and Jump$_m$ function classes. (i) Instead of randomly choosing between the only-improving and any-move acceptance operator, we take this choice via a simple two-state Markov chain. This modification alone reduces the runtime on Jump$_m$ functions with gap parameter $m$ from $Ω(n^{2m-1})$ to $O(n^{m+1})$. (ii) We then replace the all-moves acceptance operator with the operator that only accepts worsenings. Such a, counter-intuitive, operator has not been used before in the literature. However, our proofs show that our only-worsening operator can greatly help in leaving local optima, reducing, e.g., the runtime on Jump functions to $O(n^3 \log n)$ independent of the gap size. In general, we prove a remarkably good runtime of $O(n^{k+1} \log n)$ for our Markov move-acceptance hyper-heuristic on all members of a new benchmark class SEQOPT$_k$, which contains a large number of functions having $k$ successive local optima, and which contains the commonly studied Jump$_m$ and Cliff$_d$ functions for $k=2$.

NEApr 4, 2024
A Block-Coordinate Descent EMO Algorithm: Theoretical and Empirical Analysis

Benjamin Doerr, Joshua Knowles, Aneta Neumann et al.

We consider whether conditions exist under which block-coordinate descent is asymptotically efficient in evolutionary multi-objective optimization, addressing an open problem. Block-coordinate descent, where an optimization problem is decomposed into $k$ blocks of decision variables and each of the blocks is optimized (with the others fixed) in a sequence, is a technique used in some large-scale optimization problems such as airline scheduling, however its use in multi-objective optimization is less studied. We propose a block-coordinate version of GSEMO and compare its running time to the standard GSEMO algorithm. Theoretical and empirical results on a bi-objective test function, a variant of LOTZ, serve to demonstrate the existence of cases where block-coordinate descent is faster. The result may yield wider insights into this class of algorithms.

AIMay 22, 2023
The First Proven Performance Guarantees for the Non-Dominated Sorting Genetic Algorithm II (NSGA-II) on a Combinatorial Optimization Problem

Sacha Cerf, Benjamin Doerr, Benjamin Hebras et al.

The Non-dominated Sorting Genetic Algorithm-II (NSGA-II) is one of the most prominent algorithms to solve multi-objective optimization problems. Recently, the first mathematical runtime guarantees have been obtained for this algorithm, however only for synthetic benchmark problems. In this work, we give the first proven performance guarantees for a classic optimization problem, the NP-complete bi-objective minimum spanning tree problem. More specifically, we show that the NSGA-II with population size $N \ge 4((n-1) w_{\max} + 1)$ computes all extremal points of the Pareto front in an expected number of $O(m^2 n w_{\max} \log(n w_{\max}))$ iterations, where $n$ is the number of vertices, $m$ the number of edges, and $w_{\max}$ is the maximum edge weight in the problem instance. This result confirms, via mathematical means, the good performance of the NSGA-II observed empirically. It also shows that mathematical analyses of this algorithm are not only possible for synthetic benchmark problems, but also for more complex combinatorial optimization problems. As a side result, we also obtain a new analysis of the performance of the global SEMO algorithm on the bi-objective minimum spanning tree problem, which improves the previous best result by a factor of $|F|$, the number of extremal points of the Pareto front, a set that can be as large as $n w_{\max}$. The main reason for this improvement is our observation that both multi-objective evolutionary algorithms find the different extremal points in parallel rather than sequentially, as assumed in the previous proofs.

NEMay 17, 2023
Runtime Analyses of Multi-Objective Evolutionary Algorithms in the Presence of Noise

Matthieu Dinot, Benjamin Doerr, Ulysse Hennebelle et al.

In single-objective optimization, it is well known that evolutionary algorithms also without further adjustments can tolerate a certain amount of noise in the evaluation of the objective function. In contrast, this question is not at all understood for multi-objective optimization. In this work, we conduct the first mathematical runtime analysis of a simple multi-objective evolutionary algorithm (MOEA) on a classic benchmark in the presence of noise in the objective functions. We prove that when bit-wise prior noise with rate $p \le α/n$, $α$ a suitable constant, is present, the \emph{simple evolutionary multi-objective optimizer} (SEMO) without any adjustments to cope with noise finds the Pareto front of the OneMinMax benchmark in time $O(n^2\log n)$, just as in the case without noise. Given that the problem here is to arrive at a population consisting of $n+1$ individuals witnessing the Pareto front, this is a surprisingly strong robustness to noise (comparably simple evolutionary algorithms cannot optimize the single-objective OneMax problem in polynomial time when $p = ω(\log(n)/n)$). Our proofs suggest that the strong robustness of the MOEA stems from its implicit diversity mechanism designed to enable it to compute a population covering the whole Pareto front. Interestingly this result only holds when the objective value of a solution is determined only once and the algorithm from that point on works with this, possibly noisy, objective value. We prove that when all solutions are reevaluated in each iteration, then any noise rate $p = ω(\log(n)/n^2)$ leads to a super-polynomial runtime. This is very different from single-objective optimization, where it is generally preferred to reevaluate solutions whenever their fitness is important and where examples are known such that not reevaluating solutions can lead to catastrophic performance losses.

NEMay 8, 2023
Larger Offspring Populations Help the $(1 + (λ, λ))$ Genetic Algorithm to Overcome the Noise

Alexandra Ivanova, Denis Antipov, Benjamin Doerr

Evolutionary algorithms are known to be robust to noise in the evaluation of the fitness. In particular, larger offspring population sizes often lead to strong robustness. We analyze to what extent the $(1+(λ,λ))$ genetic algorithm is robust to noise. This algorithm also works with larger offspring population sizes, but an intermediate selection step and a non-standard use of crossover as repair mechanism could render this algorithm less robust than, e.g., the simple $(1+λ)$ evolutionary algorithm. Our experimental analysis on several classic benchmark problems shows that this difficulty does not arise. Surprisingly, in many situations this algorithm is even more robust to noise than the $(1+λ)$~EA.

NEJan 28, 2022
Stagnation Detection Meets Fast Mutation

Benjamin Doerr, Amirhossein Rajabi

Two mechanisms have recently been proposed that can significantly speed up finding distant improving solutions via mutation, namely using a random mutation rate drawn from a heavy-tailed distribution ("fast mutation", Doerr et al. (2017)) and increasing the mutation strength based on stagnation detection (Rajabi and Witt (2020)). Whereas the latter can obtain the asymptotically best probability of finding a single desired solution in a given distance, the former is more robust and performs much better when many improving solutions in some distance exist. In this work, we propose a mutation strategy that combines ideas of both mechanisms. We show that it can also obtain the best possible probability of finding a single distant solution. However, when several improving solutions exist, it can outperform both the stagnation-detection approach and fast mutation. The new operator is more than an interleaving of the two previous mechanisms and it also outperforms any such interleaving.

NEDec 16, 2021
Mathematical Runtime Analysis for the Non-Dominated Sorting Genetic Algorithm II (NSGA-II)

Weijie Zheng, Benjamin Doerr

The non-dominated sorting genetic algorithm II (NSGA-II) is the most intensively used multi-objective evolutionary algorithm (MOEA) in real-world applications. However, in contrast to several simple MOEAs analyzed also via mathematical means, no such study exists for the NSGA-II so far. In this work, we show that mathematical runtime analyses are feasible also for the NSGA-II. As particular results, we prove that with a population size four times larger than the size of the Pareto front, the NSGA-II with two classic mutation operators and four different ways to select the parents satisfies the same asymptotic runtime guarantees as the SEMO and GSEMO algorithms on the basic OneMinMax and LeadingOnesTrailingZeros benchmarks. However, if the population size is only equal to the size of the Pareto front, then the NSGA-II cannot efficiently compute the full Pareto front: for an exponential number of iterations, the population will always miss a constant fraction of the Pareto front. Our experiments confirm the above findings.

NESep 14, 2021
Choosing the Right Algorithm With Hints From Complexity Theory

Shouda Wang, Weijie Zheng, Benjamin Doerr

Choosing a suitable algorithm from the myriads of different search heuristics is difficult when faced with a novel optimization problem. In this work, we argue that the purely academic question of what could be the best possible algorithm in a certain broad class of black-box optimizers can give fruitful indications in which direction to search for good established optimization heuristics. We demonstrate this approach on the recently proposed DLB benchmark, for which the only known results are $O(n^3)$ runtimes for several classic evolutionary algorithms and an $O(n^2 \log n)$ runtime for an estimation-of-distribution algorithm. Our finding that the unary unbiased black-box complexity is only $O(n^2)$ suggests the Metropolis algorithm as an interesting candidate and we prove that it solves the DLB problem in quadratic time. Since we also prove that better runtimes cannot be obtained in the class of unary unbiased algorithms, we shift our attention to algorithms that use the information of more parents to generate new solutions. An artificial algorithm of this type having an $O(n \log n)$ runtime leads to the result that the significance-based compact genetic algorithm (sig-cGA) can solve the DLB problem also in time $O(n \log n)$ with high probability. Our experiments show a remarkably good performance of the Metropolis algorithm, clearly the best of all algorithms regarded for reasonable problem sizes.

NEMay 7, 2021
An Extended Jump Functions Benchmark for the Analysis of Randomized Search Heuristics

Henry Bambury, Antoine Bultel, Benjamin Doerr

Jump functions are the {most-studied} non-unimodal benchmark in the theory of randomized search heuristics, in particular, evolutionary algorithms (EAs). They have significantly improved our understanding of how EAs escape from local optima. However, their particular structure -- to leave the local optimum one can only jump directly to the global optimum -- raises the question of how representative such results are. For this reason, we propose an extended class $\textsc{Jump}_{k,δ}$ of jump functions that contain a valley of low fitness of width $δ$ starting at distance $k$ from the global optimum. We prove that several previous results extend to this more general class: for all {$k \le \frac{n^{1/3}}{\ln{n}}$} and $δ< k$, the optimal mutation rate for the $(1+1)$~EA is $\fracδ{n}$, and the fast $(1+1)$~EA runs faster than the classical $(1+1)$~EA by a factor super-exponential in $δ$. However, we also observe that some known results do not generalize: the randomized local search algorithm with stagnation detection, which is faster than the fast $(1+1)$~EA by a factor polynomial in $k$ on $\textsc{Jump}_k$, is slower by a factor polynomial in $n$ on some $\textsc{Jump}_{k,δ}$ instances. Computationally, the new class allows experiments with wider fitness valleys, especially when they lie further away from the global optimum.

NEApr 14, 2021
Lazy Parameter Tuning and Control: Choosing All Parameters Randomly From a Power-Law Distribution

Denis Antipov, Maxim Buzdalov, Benjamin Doerr

Most evolutionary algorithms have multiple parameters and their values drastically affect the performance. Due to the often complicated interplay of the parameters, setting these values right for a particular problem (parameter tuning) is a challenging task. This task becomes even more complicated when the optimal parameter values change significantly during the run of the algorithm since then a dynamic parameter choice (parameter control) is necessary. In this work, we propose a lazy but effective solution, namely choosing all parameter values (where this makes sense) in each iteration randomly from a suitably scaled power-law distribution. To demonstrate the effectiveness of this approach, we perform runtime analyses of the $(1+(λ,λ))$ genetic algorithm with all three parameters chosen in this manner. We show that this algorithm on the one hand can imitate simple hill-climbers like the $(1+1)$ EA, giving the same asymptotic runtime on problems like OneMax, LeadingOnes, or Minimum Spanning Tree. On the other hand, this algorithm is also very efficient on jump functions, where the best static parameters are very different from those necessary to optimize simple problems. We prove a performance guarantee that is comparable to the best performance known for static parameters. For the most interesting case that the jump size $k$ is constant, we prove that our performance is asymptotically better than what can be obtained with any static parameter choice. We complement our theoretical results with a rigorous empirical study confirming what the asymptotic runtime results suggest.

NEApr 7, 2021
Lower Bounds from Fitness Levels Made Easy

Benjamin Doerr, Timo Kötzing

One of the first and easy to use techniques for proving run time bounds for evolutionary algorithms is the so-called method of fitness levels by Wegener. It uses a partition of the search space into a sequence of levels which are traversed by the algorithm in increasing order, possibly skipping levels. An easy, but often strong upper bound for the run time can then be derived by adding the reciprocals of the probabilities to leave the levels (or upper bounds for these). Unfortunately, a similarly effective method for proving lower bounds has not yet been established. The strongest such method, proposed by Sudholt (2013), requires a careful choice of the viscosity parameters $γ_{i,j}$, $0 \le i < j \le n$. In this paper we present two new variants of the method, one for upper and one for lower bounds. Besides the level leaving probabilities, they only rely on the probabilities that levels are visited at all. We show that these can be computed or estimated without greater difficulties and apply our method to reprove the following known results in an easy and natural way. (i) The precise run time of the (1+1) EA on \textsc{LeadingOnes}. (ii) A lower bound for the run time of the (1+1) EA on \textsc{OneMax}, tight apart from an $O(n)$ term. (iii) A lower bound for the run time of the (1+1) EA on long $k$-paths. We also prove a tighter lower bound for the run time of the (1+1) EA on jump functions by showing that, regardless of the jump size, only with probability $O(2^{-n})$ the algorithm can avoid to jump over the valley of low fitness.

NEFeb 1, 2021
Towards Explainable Exploratory Landscape Analysis: Extreme Feature Selection for Classifying BBOB Functions

Quentin Renau, Johann Dreo, Carola Doerr et al.

Facilitated by the recent advances of Machine Learning (ML), the automated design of optimization heuristics is currently shaking up evolutionary computation (EC). Where the design of hand-picked guidelines for choosing a most suitable heuristic has long dominated research activities in the field, automatically trained heuristics are now seen to outperform human-derived choices even for well-researched optimization tasks. ML-based EC is therefore not any more a futuristic vision, but has become an integral part of our community. A key criticism that ML-based heuristics are often faced with is their potential lack of explainability, which may hinder future developments. This applies in particular to supervised learning techniques which extrapolate algorithms' performance based on exploratory landscape analysis (ELA). In such applications, it is not uncommon to use dozens of problem features to build the models underlying the specific algorithm selection or configuration task. Our goal in this work is to analyze whether this many features are indeed needed. Using the classification of the BBOB test functions as testbed, we show that a surprisingly small number of features -- often less than four -- can suffice to achieve a 98\% accuracy. Interestingly, the number of features required to meet this threshold is found to decrease with the problem dimension. We show that the classification accuracy transfers to settings in which several instances are involved in training and testing. In the leave-one-instance-out setting, however, classification accuracy drops significantly, and the transformation-invariance of the features becomes a decisive success factor.

NEDec 14, 2020
Theoretical Analyses of Multiobjective Evolutionary Algorithms on Multimodal Objectives

Weijie Zheng, Benjamin Doerr

The theoretical understanding of MOEAs is lagging far behind their success in practice. In particular, previous theory work considers mostly easy problems that are composed of unimodal objectives. As a first step towards a deeper understanding of how evolutionary algorithms solve multimodal multiobjective problems, we propose the OJZJ problem, a bi-objective problem composed of two objectives isomorphic to the classic jump function benchmark. We prove that SEMO with probability one does not compute the full Pareto front, regardless of the runtime. In contrast, for all problem sizes $n$ and all jump sizes ${k \in [4..\frac n2 - 1]}$, the global SEMO (GSEMO) covers the Pareto front in an expected number of $Θ((n-2k)n^{k})$ iterations. For $k = o(n)$, we also show the tighter bound $\frac 32 e n^{k+1} \pm o(n^{k+1})$, which might be the first runtime bound for an MOEA that is tight apart from lower-order terms. We also combine the GSEMO with two approaches that showed advantages in single-objective multimodal problems. When using the GSEMO with a heavy-tailed mutation operator, the expected runtime improves by a factor of at least $k^{Ω(k)}$. When adapting the recent stagnation-detection strategy of Rajabi and Witt (2022) to the GSEMO, the expected runtime also improves by a factor of at least $k^{Ω(k)}$ and surpasses the heavy-tailed GSEMO by a small polynomial factor in $k$. Via an experimental analysis, we show that these asymptotic differences are visible already for small problem sizes: A factor-$5$ speed-up from heavy-tailed mutation and a factor-$10$ speed-up from stagnation detection can be observed already for jump size~$4$ and problem sizes between $10$ and $50$. Overall, our results show that the ideas recently developed to aid single-objective evolutionary algorithms to cope with local optima can be effectively employed also in multiobjective optimization.

NEJul 16, 2020
The Univariate Marginal Distribution Algorithm Copes Well With Deception and Epistasis

Benjamin Doerr, Martin S. Krejca

In their recent work, Lehre and Nguyen (FOGA 2019) show that the univariate marginal distribution algorithm (UMDA) needs time exponential in the parent populations size to optimize the DeceptiveLeadingBlocks (DLB) problem. They conclude from this result that univariate EDAs have difficulties with deception and epistasis. In this work, we show that this negative finding is caused by an unfortunate choice of the parameters of the UMDA. When the population sizes are chosen large enough to prevent genetic drift, then the UMDA optimizes the DLB problem with high probability with at most $λ(\frac{n}{2} + 2 e \ln n)$ fitness evaluations. Since an offspring population size $λ$ of order $n \log n$ can prevent genetic drift, the UMDA can solve the DLB problem with $O(n^2 \log n)$ fitness evaluations. In contrast, for classic evolutionary algorithms no better run time guarantee than $O(n^3)$ is known (which we prove to be tight for the ${(1+1)}$ EA), so our result rather suggests that the UMDA can cope well with deception and epistatis. From a broader perspective, our result shows that the UMDA can cope better with local optima than evolutionary algorithms; such a result was previously known only for the compact genetic algorithm. Together with the lower bound of Lehre and Nguyen, our result for the first time rigorously proves that running EDAs in the regime with genetic drift can lead to drastic performance losses.

NEJun 30, 2020
A Survey on Recent Progress in the Theory of Evolutionary Algorithms for Discrete Optimization

Benjamin Doerr, Frank Neumann

The theory of evolutionary computation for discrete search spaces has made significant progress in the last ten years. This survey summarizes some of the most important recent results in this research area. It discusses fine-grained models of runtime analysis of evolutionary algorithms, highlights recent theoretical insights on parameter tuning and parameter control, and summarizes the latest advances for stochastic and dynamic problems. We regard how evolutionary algorithms optimize submodular functions and we give an overview over the large body of recent results on estimation of distribution algorithms. Finally, we present the state of the art of drift analysis, one of the most powerful analysis technique developed in this field.

NEJun 22, 2020
First Steps Towards a Runtime Analysis When Starting With a Good Solution

Denis Antipov, Maxim Buzdalov, Benjamin Doerr

The mathematical runtime analysis of evolutionary algorithms traditionally regards the time an algorithm needs to find a solution of a certain quality when initialized with a random population. In practical applications it may be possible to guess solutions that are better than random ones. We start a mathematical runtime analysis for such situations. We observe that different algorithms profit to a very different degree from a better initialization. We also show that the optimal parameterization of the algorithm can depend strongly on the quality of the initial solutions. To overcome this difficulty, self-adjusting and randomized heavy-tailed parameter choices can be profitable. Finally, we observe a larger gap between the performance of the best evolutionary algorithm we found and the corresponding black-box complexity. This could suggest that evolutionary algorithms better exploiting good initial solutions are still to be found. These first findings stem from analyzing the performance of the $(1+1)$ evolutionary algorithm and the static, self-adjusting, and heavy-tailed $(1 + (λ,λ))$ GA on the OneMax benchmark. We are optimistic that the question how to profit from good initial solutions is interesting beyond these first examples.

NEJun 19, 2020
Exploratory Landscape Analysis is Strongly Sensitive to the Sampling Strategy

Quentin Renau, Carola Doerr, Johann Dreo et al.

Exploratory landscape analysis (ELA) supports supervised learning approaches for automated algorithm selection and configuration by providing sets of features that quantify the most relevant characteristics of the optimization problem at hand. In black-box optimization, where an explicit problem representation is not available, the feature values need to be approximated from a small number of sample points. In practice, uniformly sampled random point sets and Latin hypercube constructions are commonly used sampling strategies. In this work, we analyze how the sampling method and the sample size influence the quality of the feature value approximations and how this quality impacts the accuracy of a standard classification task. While, not unexpectedly, increasing the number of sample points gives more robust estimates for the feature values, to our surprise we find that the feature value approximations for different sampling strategies do not converge to the same value. This implies that approximated feature values cannot be interpreted independently of the underlying sampling strategy. As our classification experiments show, this also implies that the feature approximations used for training a classifier must stem from the same sampling strategy as those used for the actual classification tasks. As a side result we show that classifiers trained with feature values approximated by Sobol' sequences achieve higher accuracy than any of the standard sampling techniques. This may indicate improvement potential for ELA-trained machine learning models.

NEJun 8, 2020
Runtime Analysis of Evolutionary Algorithms via Symmetry Arguments

Benjamin Doerr

We use an elementary argument building on group actions to prove that the selection-free steady state genetic algorithm analyzed by Sutton and Witt (GECCO 2019) takes an expected number of $Ω(2^n / \sqrt n)$ iterations to find any particular target search point. This bound is valid for all population sizes $μ$. Our result improves over the previous lower bound of $Ω(\exp(n^{δ/2}))$ valid for population sizes $μ= O(n^{1/2 - δ})$, $0 < δ< 1/2$.

NEJun 5, 2020
Runtime Analysis of a Heavy-Tailed $(1+(λ,λ))$ Genetic Algorithm on Jump Functions

Denis Antipov, Benjamin Doerr

It was recently observed that the $(1+(λ,λ))$ genetic algorithm can comparably easily escape the local optimum of the jump functions benchmark. Consequently, this algorithm can optimize the jump function with jump size $k$ in an expected runtime of only $n^{(k + 1)/2}k^{-k/2}e^{O(k)}$ fitness evaluations (Antipov, Doerr, Karavaev (GECCO 2020)). To obtain this performance, however, a non-standard parameter setting depending on the jump size $k$ was used. To overcome this difficulty, we propose to choose two parameters of the $(1+(λ,λ))$ genetic algorithm randomly from a power-law distribution. Via a mathematical runtime analysis, we show that this algorithm with natural instance-independent choices of the distribution parameters on all jump functions with jump size at most $n/4$ has a performance close to what the best instance-specific parameters in the previous work obtained. This price for instance-independence can be made as small as an $O(n\log(n))$ factor. Given the difficulty of the jump problem and the runtime losses from using mildly suboptimal fixed parameters (also discussed in this work), this appears to be a fair price.

NEMay 2, 2020
Lower Bounds for Non-Elitist Evolutionary Algorithms via Negative Multiplicative Drift

Benjamin Doerr

A decent number of lower bounds for non-elitist population-based evolutionary algorithms has been shown by now. Most of them are technically demanding due to the (hard to avoid) use of negative drift theorems -- general results which translate an expected progress away from the target into a high hitting time. We propose a simple negative drift theorem for multiplicative drift scenarios and show that it can simplify existing analyses. We discuss in more detail Lehre's (PPSN 2010) \emph{negative drift in populations} method, one of the most general tools to prove lower bounds on the runtime of non-elitist mutation-based evolutionary algorithms for discrete search spaces. Together with other arguments, we obtain an alternative and simpler proof, which also strengthens and simplifies this method. In particular, now only three of the five technical conditions of the previous result have to be verified. The lower bounds we obtain are explicit instead of only asymptotic. This allows to compute concrete lower bounds for concrete algorithms, but also enables us to show that super-polynomial runtimes appear already when the reproduction rate is only a $(1 - ω(n^{-1/2}))$ factor below the threshold. For the special case of algorithms using standard bit mutation with a random mutation rate (called uniform mixing in the language of hyper-heuristics), we prove the result stated by Dang and Lehre (PPSN 2016) and extend it to mutation rates other than $Θ(1/n)$, which includes the heavy-tailed mutation operator proposed by Doerr, Le, Makhmara, and Nguyen (GECCO 2017). We finally apply our method and a novel domination argument to show an exponential lower bound for the runtime of the mutation-only simple genetic algorithm on \onemax for arbitrary population size.

NEApr 20, 2020
Fixed-Target Runtime Analysis

Maxim Buzdalov, Benjamin Doerr, Carola Doerr et al.

Runtime analysis aims at contributing to our understanding of evolutionary algorithms through mathematical analyses of their runtimes. In the context of discrete optimization problems, runtime analysis classically studies the time needed to find an optimal solution. However, both from a practical and from a theoretical viewpoint, more fine-grained performance measures are needed to gain a more detailed understanding of the main working principles and their resulting performance implications. Two complementary approaches have been suggested: fixed-budget analyses and fixed-target analyses. In this work, we conduct an in-depth study on the advantages and the limitations of fixed-target analyses. We show that, different from fixed-budget analyses, many classical methods from the runtime analysis of discrete evolutionary algorithms yield fixed-target results without greater effort. We use this to conduct a number of new fixed-target analyses. However, we also point out examples where an extension of existing runtime results to fixed-target results is highly non-trivial.