Timo Kötzing

NE
22papers
278citations
Novelty44%
AI Score24

22 Papers

NEApr 7, 2021
Lower Bounds from Fitness Levels Made Easy

Benjamin Doerr, Timo Kötzing

One of the first and easy to use techniques for proving run time bounds for evolutionary algorithms is the so-called method of fitness levels by Wegener. It uses a partition of the search space into a sequence of levels which are traversed by the algorithm in increasing order, possibly skipping levels. An easy, but often strong upper bound for the run time can then be derived by adding the reciprocals of the probabilities to leave the levels (or upper bounds for these). Unfortunately, a similarly effective method for proving lower bounds has not yet been established. The strongest such method, proposed by Sudholt (2013), requires a careful choice of the viscosity parameters $γ_{i,j}$, $0 \le i < j \le n$. In this paper we present two new variants of the method, one for upper and one for lower bounds. Besides the level leaving probabilities, they only rely on the probabilities that levels are visited at all. We show that these can be computed or estimated without greater difficulties and apply our method to reprove the following known results in an easy and natural way. (i) The precise run time of the (1+1) EA on \textsc{LeadingOnes}. (ii) A lower bound for the run time of the (1+1) EA on \textsc{OneMax}, tight apart from an $O(n)$ term. (iii) A lower bound for the run time of the (1+1) EA on long $k$-paths. We also prove a tighter lower bound for the run time of the (1+1) EA on jump functions by showing that, regardless of the jump size, only with probability $O(2^{-n})$ the algorithm can avoid to jump over the valley of low fitness.

LGOct 15, 2020
Maps for Learning Indexable Classes

Julian Berger, Maximilian Böther, Vanja Doskoč et al.

We study learning of indexed families from positive data where a learner can freely choose a hypothesis space (with uniformly decidable membership) comprising at least the languages to be learned. This abstracts a very universal learning task which can be found in many areas, for example learning of (subsets of) regular languages or learning of natural languages. We are interested in various restrictions on learning, such as consistency, conservativeness or set-drivenness, exemplifying various natural learning restrictions. Building on previous results from the literature, we provide several maps (depictions of all pairwise relations) of various groups of learning criteria, including a map for monotonicity restrictions and similar criteria and a map for restrictions on data presentation. Furthermore, we consider, for various learning criteria, whether learners can be assumed consistent.

LOOct 15, 2020
Learning Languages with Decidable Hypotheses

Julian Berger, Maximilian Böther, Vanja Doskoč et al.

In language learning in the limit, the most common type of hypothesis is to give an enumerator for a language. This so-called $W$-index allows for naming arbitrary computably enumerable languages, with the drawback that even the membership problem is undecidable. In this paper we use a different system which allows for naming arbitrary decidable languages, namely programs for characteristic functions (called $C$-indices). These indices have the drawback that it is now not decidable whether a given hypothesis is even a legal $C$-index. In this first analysis of learning with $C$-indices, we give a structured account of the learning power of various restrictions employing $C$-indices, also when compared with $W$-indices. We establish a hierarchy of learning power depending on whether $C$-indices are required (a) on all outputs; (b) only on outputs relevant for the class to be learned and (c) only in the limit as final, correct hypotheses. Furthermore, all these settings are weaker than learning with $W$-indices (even when restricted to classes of computable languages). We analyze all these questions also in relation to the mode of data presentation. Finally, we also ask about the relation of semantic versus syntactic convergence and derive the map of pairwise relations for these two kinds of convergence coupled with various forms of data presentation.

LGOct 15, 2020
Normal Forms for (Semantically) Witness-Based Learners in Inductive Inference

Vanja Doskoč, Timo Kötzing

We study learners (computable devices) inferring formal languages, a setting referred to as language learning in the limit or inductive inference. In particular, we require the learners we investigate to be witness-based, that is, to justify each of their mind changes. Besides being a natural requirement for a learning task, this restriction deserves special attention as it is a specialization of various important learning paradigms. In particular, with the help of witness-based learning, explanatory learners are shown to be equally powerful under these seemingly incomparable paradigms. Nonetheless, until now, witness-based learners have only been studied sparsely. In this work, we conduct a thorough study of these learners both when requiring syntactic and semantic convergence and obtain normal forms thereof. In the former setting, we extend known results such that they include witness-based learning and generalize these to hold for a variety of learners. Transitioning to behaviourally correct learning, we also provide normal forms for semantically witness-based learners. Most notably, we show that set-driven globally semantically witness-based learners are equally powerful as their Gold-style semantically conservative counterpart. Such results are key to understanding the, yet undiscovered, mutual relation between various important learning paradigms when learning behaviourally correctly.

LGOct 15, 2020
Mapping Monotonic Restrictions in Inductive Inference

Vanja Doskoč, Timo Kötzing

In language learning in the limit we investigate computable devices (learners) learning formal languages. Through the years, many natural restrictions have been imposed on the studied learners. As such, monotonic restrictions always enjoyed particular attention as, although being a natural requirement, monotonic learners show significantly diverse behaviour when studied in different settings. A recent study thoroughly analysed the learning capabilities of strongly monotone learners imposed with memory restrictions and various additional requirements. The unveiled differences between explanatory and behaviourally correct such learners motivate our studies of monotone learners dealing with the same restrictions. We reveal differences and similarities between monotone learners and their strongly monotone counterpart when studied with various additional restrictions. In particular, we show that explanatory monotone learners, although known to be strictly stronger, do (almost) preserve the pairwise relation as seen in strongly monotone learning. Contrasting this similarity, we find substantial differences when studying behaviourally correct monotone learners. Most notably, we show that monotone learners, as opposed to their strongly monotone counterpart, do heavily rely on the order the information is given in, an unusual result for behaviourally correct learners.

FLOct 9, 2020
Learning Languages in the Limit from Positive Information with Finitely Many Memory Changes

Timo Kötzing, Karen Seidel

We investigate learning collections of languages from texts by an inductive inference machine with access to the current datum and a bounded memory in form of states. Such a bounded memory states (BMS) learner is considered successful in case it eventually settles on a correct hypothesis while exploiting only finitely many different states. We give the complete map of all pairwise relations for an established collection of criteria of successfull learning. Most prominently, we show that non-U-shapedness is not restrictive, while conservativeness and (strong) monotonicity are. Some results carry over from iterative learning by a general lemma showing that, for a wealth of restrictions (the semantic restrictions), iterative and bounded memory states learning are equivalent. We also give an example of a non-semantic restriction (strongly non-U-shapedness) where the two settings differ.

LGOct 7, 2020
Learning Half-Spaces and other Concept Classes in the Limit with Iterative Learners

Ardalan Khazraei, Timo Kötzing, Karen Seidel

In order to model an efficient learning paradigm, iterative learning algorithms access data one by one, updating the current hypothesis without regress to past data. Past research on iterative learning analyzed for example many important additional requirements and their impact on iterative learners. In this paper, our results are twofold. First, we analyze the relative learning power of various settings of iterative learning, including learning from text and from informant, as well as various further restrictions, for example we show that strongly non-U-shaped learning is restrictive for iterative learning from informant. Second, we investigate the learnability of the concept class of half-spaces and provide a constructive iterative algorithm to learn the set of half-spaces from informant.

NEJun 12, 2020
Improved Fixed-Budget Results via Drift Analysis

Timo Kötzing, Carsten Witt

Fixed-budget theory is concerned with computing or bounding the fitness value achievable by randomized search heuristics within a given budget of fitness function evaluations. Despite recent progress in fixed-budget theory, there is a lack of general tools to derive such results. We transfer drift theory, the key tool to derive expected optimization times, to the fixed-budged perspective. A first and easy-to-use statement concerned with iterating drift in so-called greed-admitting scenarios immediately translates into bounds on the expected function value. Afterwards, we consider a more general tool based on the well-known variable drift theorem. Applications of this technique to the LeadingOnes benchmark function yield statements that are more precise than the previous state of the art.

NEApr 11, 2019
Multiplicative Up-Drift

Benjamin Doerr, Timo Kötzing

Drift analysis aims at translating the expected progress of an evolutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analysis of evolutionary algorithms, however, only for the situation that the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a $(1+δ)$-multiplicative growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a simple and insightful proof of the level-based theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the best-possible near-linear dependence on $1/δ$ (the previous results had an at least near-quadratic dependence), and it only requires a population size near-linear in $δ$ (this was super-quadratic in previous results). These improvements immediately lead to stronger run time guarantees for a number of applications. We also discuss the case of large $δ$ and show stronger results for this setting.

NEJun 6, 2018
Bounding Bloat in Genetic Programming

Benjamin Doerr, Timo Kötzing, J. A. Gregor Lagodzinski et al.

While many optimization problems work with a fixed number of decision variables and thus a fixed-length representation of possible solutions, genetic programming (GP) works on variable-length representations. A naturally occurring problem is that of bloat (unnecessary growth of solutions) slowing down optimization. Theoretical analyses could so far not bound bloat and required explicit assumptions on the magnitude of bloat. In this paper we analyze bloat in mutation-based genetic programming for the two test functions ORDER and MAJORITY. We overcome previous assumptions on the magnitude of bloat and give matching or close-to-matching upper and lower bounds for the expected optimization time. In particular, we show that the (1+1) GP takes (i) $Θ(T_{init} + n \log n)$ iterations with bloat control on ORDER as well as MAJORITY; and (ii) $O(T_{init} \log T_{init} + n (\log n)^3)$ and $Ω(T_{init} + n \log n)$ (and $Ω(T_{init} \log T_{init})$ for $n=1$) iterations without bloat control on MAJORITY.

NEJun 4, 2018
Ring Migration Topology Helps Bypassing Local Optima

Clemens Frahnow, Timo Kötzing

Running several evolutionary algorithms in parallel and occasionally exchanging good solutions is referred to as island models. The idea is that the independence of the different islands leads to diversity, thus possibly exploring the search space better. Many theoretical analyses so far have found a complete (or sufficiently quickly expanding) topology as underlying migration graph most efficient for optimization, even though a quick dissemination of individuals leads to a loss of diversity. We suggest a simple fitness function FORK with two local optima parametrized by $r \geq 2$ and a scheme for composite fitness functions. We show that, while the (1+1) EA gets stuck in a bad local optimum and incurs a run time of $Θ(n^{2r})$ fitness evaluations on FORK, island models with a complete topology can achieve a run time of $Θ(n^{1.5r})$ by making use of rare migrations in order to explore the search space more effectively. Finally, the ring topology, making use of rare migrations and a large diameter, can achieve a run time of $\tildeΘ(n^r)$, the black box complexity of FORK. This shows that the ring topology can be preferable over the complete topology in order to maintain diversity.

NEMay 25, 2018
Destructiveness of Lexicographic Parsimony Pressure and Alleviation by a Concatenation Crossover in Genetic Programming

Timo Kötzing, J. A. Gregor Lagodzinski, Johannes Lengler et al.

For theoretical analyses there are two specifics distinguishing GP from many other areas of evolutionary computation. First, the variable size representations, in particular yielding a possible bloat (i.e. the growth of individuals with redundant parts). Second, the role and realization of crossover, which is particularly central in GP due to the tree-based representation. Whereas some theoretical work on GP has studied the effects of bloat, crossover had a surprisingly little share in this work. We analyze a simple crossover operator in combination with local search, where a preference for small solutions minimizes bloat (lexicographic parsimony pressure); the resulting algorithm is denoted Concatenation Crossover GP. For this purpose three variants of the well-studied MAJORITY test function with large plateaus are considered. We show that the Concatenation Crossover GP can efficiently optimize these test functions, while local search cannot be efficient for all three variants independent of employing bloat control.

PRMay 22, 2018
First-Hitting Times Under Additive Drift

Timo Kötzing, Martin S. Krejca

For the last ten years, almost every theoretical result concerning the expected run time of a randomized search heuristic used drift theory, making it the arguably most important tool in this domain. Its success is due to its ease of use and its powerful result: drift theory allows the user to derive bounds on the expected first-hitting time of a random process by bounding expected local changes of the process -- the drift. This is usually far easier than bounding the expected first-hitting time directly. Due to the widespread use of drift theory, it is of utmost importance to have the best drift theorems possible. We improve the fundamental additive, multiplicative, and variable drift theorems by stating them in a form as general as possible and providing examples of why the restrictions we keep are still necessary. Our additive drift theorem for upper bounds only requires the process to be nonnegative, that is, we remove unnecessary restrictions like a finite, discrete, or bounded search space. As corollaries, the same is true for our upper bounds in the case of variable and multiplicative drift.

FLJan 31, 2018
Learning from Informants: Relations between Learning Success Criteria

Martin Aschenbach, Timo Kötzing, Karen Seidel

Learning from positive and negative information, so-called \emph{informants}, being one of the models for human and machine learning introduced by E.~M.~Gold, is investigated. Particularly, naturally arising questions about this learning setting, originating in results on learning from solely positive information, are answered. By a carefully arranged argument learners can be assumed to only change their hypothesis in case it is inconsistent with the data (such a learning behavior is called \emph{conservative}). The deduced main theorem states the relations between the most important delayable learning success criteria, being the ones not ruined by a delayed in time hypothesis output. Additionally, our investigations concerning the non-delayable requirement of consistent learning underpin the claim for \emph{delayability} being the right structural property to gain a deeper understanding concerning the nature of learning success criteria. Moreover, we obtain an anomalous \emph{hierarchy} when allowing for an increasing finite number of \emph{anomalies} of the hypothesized language by the learner compared with the language to be learned. In contrast to the vacillatory hierarchy for learning from solely positive information, we observe a \emph{duality} depending on whether infinitely many \emph{vacillations} between different (almost) correct hypotheses are still considered a successful learning behavior.

AISep 13, 2016
A Generic Bet-and-run Strategy for Speeding Up Traveling Salesperson and Minimum Vertex Cover

Tobias Friedrich, Timo Kötzing, Markus Wagner

A common strategy for improving optimization algorithms is to restart the algorithm when it is believed to be trapped in an inferior part of the search space. However, while specific restart strategies have been developed for specific problems (and specific algorithms), restarts are typically not regarded as a general tool to speed up an optimization algorithm. In fact, many optimization algorithms do not employ restarts at all. Recently, "bet-and-run" was introduced in the context of mixed-integer programming, where first a number of short runs with randomized initial conditions is made, and then the most promising run of these is continued. In this article, we consider two classical NP-complete combinatorial optimization problems, traveling salesperson and minimum vertex cover, and study the effectiveness of different bet-and-run strategies. In particular, our restart strategies do not take any problem knowledge into account, nor are tailored to the optimization algorithm. Therefore, they can be used off-the-shelf. We observe that state-of-the-art solvers for these problems can benefit significantly from restarts on standard benchmark instances.

NEAug 10, 2016
Escaping Local Optima using Crossover with Emergent or Reinforced Diversity

Duc-Cuong Dang, Tobias Friedrich, Timo Kötzing et al.

Population diversity is essential for avoiding premature convergence in Genetic Algorithms (GAs) and for the effective use of crossover. Yet the dynamics of how diversity emerges in populations are not well understood. We use rigorous runtime analysis to gain insight into population dynamics and GA performance for the ($μ$+1) GA and the $\text{Jump}_k$ test function. We show that the interplay of crossover and mutation may serve as a catalyst leading to a sudden burst of diversity. This leads to improvements of the expected optimisation time of order $Ω(n/\log n)$ compared to mutation-only algorithms like (1+1) EA. Moreover, increasing the mutation rate by an arbitrarily small constant factor can facilitate the generation of diversity, leading to speedups of order $Ω(n)$. We also compare seven commonly used diversity mechanisms and evaluate their impact on runtime bounds for the ($μ$+1) GA. All previous results in this context only hold for unrealistically low crossover probability $p_c=O(k/n)$, while we give analyses for the setting of constant $p_c < 1$ in all but one case. For the typical case of constant $k > 2$ and constant $p_c$, we can compare the resulting expected runtimes for different diversity mechanisms assuming an optimal choice of $μ$: $O(n^{k-1})$ for duplicate elimination/minim., $O(n^2\log n)$ for maximising the convex hull, $O(n\log n)$ for deterministic crowding (assuming $p_c = k/n$), $O(n\log n)$ for maximising Hamming distance, $O(n\log n)$ for fitness sharing, $O(n\log n)$ for single-receiver island model. This proves a sizeable advantage of all variants of the ($μ$+1) GA compared to (1+1) EA, which requires time $Θ(n^k)$. Experiments complement our theoretical findings and further highlight the benefits of crossover and diversity on $\text{Jump}_k$.

NEApr 12, 2016
The Right Mutation Strength for Multi-Valued Decision Variables

Benjamin Doerr, Carola Doerr, Timo Kötzing

The most common representation in evolutionary computation are bit strings. This is ideal to model binary decision variables, but less useful for variables taking more values. With very little theoretical work existing on how to use evolutionary algorithms for such optimization problems, we study the run time of simple evolutionary algorithms on some OneMax-like functions defined over $Ω= \{0, 1, \dots, r-1\}^n$. More precisely, we regard a variety of problem classes requesting the component-wise minimization of the distance to an unknown target vector $z \in Ω$. For such problems we see a crucial difference in how we extend the standard-bit mutation operator to these multi-valued domains. While it is natural to select each position of the solution vector to be changed independently with probability $1/n$, there are various ways to then change such a position. If we change each selected position to a random value different from the original one, we obtain an expected run time of $Θ(nr \log n)$. If we change each selected position by either $+1$ or $-1$ (random choice), the optimization time reduces to $Θ(nr + n\log n)$. If we use a random mutation strength $i \in \{0,1,\ldots,r-1\}^n$ with probability inversely proportional to $i$ and change the selected position by either $+i$ or $-i$ (random choice), then the optimization time becomes $Θ(n \log(r)(\log(n)+\log(r)))$, bringing down the dependence on $r$ from linear to polylogarithmic. One of our results depends on a new variant of the lower bounding multiplicative drift theorem.

NEJun 19, 2015
Solving Problems with Unknown Solution Length at (Almost) No Extra Cost

Benjamin Doerr, Carola Doerr, Timo Kötzing

Most research in the theory of evolutionary computation assumes that the problem at hand has a fixed problem size. This assumption does not always apply to real-world optimization challenges, where the length of an optimal solution may be unknown a priori. Following up on previous work of Cathabard, Lehre, and Yao [FOGA 2011] we analyze variants of the (1+1) evolutionary algorithm for problems with unknown solution length. For their setting, in which the solution length is sampled from a geometric distribution, we provide mutation rates that yield an expected optimization time that is of the same order as that of the (1+1) EA knowing the solution length. We then show that almost the same run times can be achieved even if \emph{no} a priori information on the solution length is available. Finally, we provide mutation rates suitable for settings in which neither the solution length nor the positions of the relevant bits are known. Again we obtain almost optimal run times for the \textsc{OneMax} and \textsc{LeadingOnes} test functions, thus solving an open problem from Cathabard et al.

NEFeb 10, 2015
The Benefit of Sex in Noisy Evolutionary Search

Tobias Friedrich, Timo Kötzing, Martin Krejca et al.

The benefit of sexual recombination is one of the most fundamental questions both in population genetics and evolutionary computation. It is widely believed that recombination helps solving difficult optimization problems. We present the first result, which rigorously proves that it is beneficial to use sexual recombination in an uncertain environment with a noisy fitness function. For this, we model sexual recombination with a simple estimation of distribution algorithm called the Compact Genetic Algorithm (cGA), which we compare with the classical $μ+1$ EA. For a simple noisy fitness function with additive Gaussian posterior noise $\mathcal{N}(0,σ^2)$, we prove that the mutation-only $μ+1$ EA typically cannot handle noise in polynomial time for $σ^2$ large enough while the cGA runs in polynomial time as long as the population size is not too small. This shows that in this uncertain environment sexual recombination is provably beneficial. We observe the same behavior in a small empirical study.

LGApr 29, 2014
A Map of Update Constraints in Inductive Inference

Timo Kötzing, Raphaela Palenta

We investigate how different learning restrictions reduce learning power and how the different restrictions relate to one another. We give a complete map for nine different restrictions both for the cases of complete information learning and set-driven learning. This completes the picture for these well-studied \emph{delayable} learning restrictions. A further insight is gained by different characterizations of \emph{conservative} learning in terms of variants of \emph{cautious} learning. Our analyses greatly benefit from general theorems we give, for example showing that learners with exclusively delayable restrictions can always be assumed total.

NEMar 30, 2014
Unbiased Black-Box Complexities of Jump Functions

Benjamin Doerr, Carola Doerr, Timo Kötzing

We analyze the unbiased black-box complexity of jump functions with small, medium, and large sizes of the fitness plateau surrounding the optimal solution. Among other results, we show that when the jump size is $(1/2 - \varepsilon)n$, that is, only a small constant fraction of the fitness values is visible, then the unbiased black-box complexities for arities $3$ and higher are of the same order as those for the simple \textsc{OneMax} function. Even for the extreme jump function, in which all but the two fitness values $n/2$ and $n$ are blanked out, polynomial-time mutation-based (i.e., unary unbiased) black-box optimization algorithms exist. This is quite surprising given that for the extreme jump function almost the whole search space (all but a $Θ(n^{-1/2})$ fraction) is a plateau of constant fitness. To prove these results, we introduce new tools for the analysis of unbiased black-box complexities, for example, selecting the new parent individual not by comparing the fitnesses of the competing search points, but also by taking into account the (empirical) expected fitnesses of their offspring.

NEJul 2, 2012
More Effective Crossover Operators for the All-Pairs Shortest Path Problem

Benjamin Doerr, Daniel Johannsen, Timo Kötzing et al.

The all-pairs shortest path problem is the first non-artificial problem for which it was shown that adding crossover can significantly speed up a mutation-only evolutionary algorithm. Recently, the analysis of this algorithm was refined and it was shown to have an expected optimization time (w.r.t. the number of fitness evaluations) of $Θ(n^{3.25}(\log n)^{0.25})$. In contrast to this simple algorithm, evolutionary algorithms used in practice usually employ refined recombination strategies in order to avoid the creation of infeasible offspring. We study extensions of the basic algorithm by two such concepts which are central in recombination, namely \emph{repair mechanisms} and \emph{parent selection}. We show that repairing infeasible offspring leads to an improved expected optimization time of $\mathord{O}(n^{3.2}(\log n)^{0.2})$. As a second part of our study we prove that choosing parents that guarantee feasible offspring results in an even better optimization time of $\mathord{O}(n^{3}\log n)$. Both results show that already simple adjustments of the recombination operator can asymptotically improve the runtime of evolutionary algorithms.