NE · LG · Jan 23, 2024

DALex: Lexicase-like Selection via Diverse Aggregation

arXiv:2401.12424v29 citations · h-index: 6 · EuroGP
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck for researchers and practitioners in evolutionary computation and machine learning by providing a faster alternative to lexicase selection. The advance is incremental, since it builds on existing selection methods.

The paper tackles the computational cost of lexicase selection in evolutionary computation. It proposes DALex, which selects individuals by randomly weighted sums of training case errors and so produces nearly identical selection outcomes at a significant speedup. The results cover program synthesis, deep learning, symbolic regression, and learning classifier systems.

Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so significantly more quickly. The new method, called DALex (for Diversely Aggregated Lexicase), selects the best individual with respect to a weighted sum of training case errors, where the weights are randomly sampled. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its "relaxed" variants, such as epsilon or batch lexicase selection, by adjusting a single hyperparameter, named "particularity pressure," which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.
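To make the contrast in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It places the iterated filtering of standard lexicase selection next to a DALex-style selection computed as a single matrix multiplication. The function names `lexicase_select` and `dalex_select` are hypothetical. The abstract says only that the weights are randomly sampled, so the weight-sampling scheme here (a softmax over Gaussian scores whose standard deviation is the particularity pressure) is an assumption. Any error preprocessing the paper may apply is omitted.

```python
import numpy as np


def lexicase_select(errors, rng):
    """Standard lexicase selection for one parent.

    errors: array of shape (n_individuals, n_cases), lower is better.
    Filters the candidate pool on randomly ordered cases, one at a time.
    """
    n_individuals, n_cases = errors.shape
    candidates = np.arange(n_individuals)
    for case in rng.permutation(n_cases):
        case_errors = errors[candidates, case]
        # Keep only the candidates that are best on this case.
        candidates = candidates[case_errors == case_errors.min()]
        if len(candidates) == 1:
            break
    # Any remaining ties are broken uniformly at random.
    return int(rng.choice(candidates))


def dalex_select(errors, n_selections, particularity_pressure, rng):
    """DALex-style selection of n_selections parents at once.

    Assumption (not stated in the abstract): each selection event draws one
    Gaussian importance score per case with standard deviation
    `particularity_pressure`, and a softmax turns the scores into weights.
    A large pressure concentrates the weight on a few cases.
    """
    n_individuals, n_cases = errors.shape
    scores = rng.normal(0.0, particularity_pressure, size=(n_selections, n_cases))
    scores -= scores.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    # Core computation: (n_selections, n_cases) @ (n_cases, n_individuals).
    aggregated = weights @ errors.T
    # Select the individual with the lowest weighted error per selection event.
    # argmin breaks exact ties by lowest index, a simplification.
    return aggregated.argmin(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    errors = rng.random((6, 4))  # toy population: 6 individuals, 4 cases
    print("lexicase parent:", lexicase_select(errors, rng))
    print("DALex parents:  ", dalex_select(errors, 5, 20.0, rng))
```

In this sketch a large `particularity_pressure` makes one case dominate each weighted sum, and a small value pushes the weights towards a plain average of the case errors. This is one way to read the interpolation between lexicase selection and its relaxed variants described above.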

Code Implementations: 1 repo