MLAug 7, 2023
Generalized Early Stopping in Evolutionary Direct Policy SearchEtor Arza, Leni K. Le Goff, Emma Hart
Lengthy evaluation times are common in many optimization problems such as direct policy search tasks, especially when they involve conducting evaluations in the physical world, e.g. in robotics applications. Often when evaluating solution over a fixed time period it becomes clear that the objective value will not increase with additional computation time (for example when a two wheeled robot continuously spins on the spot). In such cases, it makes sense to stop the evaluation early to save computation time. However, most approaches to stop the evaluation are problem specific and need to be specifically designed for the task at hand. Therefore, we propose an early stopping method for direct policy search. The proposed method only looks at the objective value at each time step and requires no problem specific knowledge. We test the introduced stopping criterion in five direct policy search environments drawn from games, robotics and classic control domains, and show that it can save up to 75% of the computation time. We also compare it with problem specific stopping criteria and show that it performs comparably, while being more generally applicable.
MLMar 15, 2022
Comparing Two Samples Through Stochastic Dominance: A Graphical ApproachEtor Arza, Josu Ceberio, Ekhiñe Irurozki et al.
Non-deterministic measurements are common in real-world scenarios: the performance of a stochastic optimization algorithm or the total reward of a reinforcement learning agent in a chaotic environment are just two examples in which unpredictable outcomes are common. These measures can be modeled as random variables and compared among each other via their expected values or more sophisticated tools such as null hypothesis statistical tests. In this paper, we propose an alternative framework to visually compare two samples according to their estimated cumulative distribution functions. First, we introduce a dominance measure for two random variables that quantifies the proportion in which the cumulative distribution function of one of the random variables stochastically dominates the other one. Then, we present a graphical method that decomposes in quantiles i) the proposed dominance measure and ii) the probability that one of the random variables takes lower values than the other. With illustrative purposes, we re-evaluate the experimentation of an already published work with the proposed methodology and we show that additional conclusions (missed by the rest of the methods) can be inferred. Additionally, the software package RVCompare was created as a convenient way of applying and experimenting with the proposed framework.
ROSep 13, 2024
An Empirical Study on the Computation Budget of Co-Optimization of Robot Design and Control in SimulationEtor Arza, Frank Veenstra, Tønnes F. Nygaard et al.
The design (shape) of a robot is usually decided before the control is implemented. This might limit how well the design is adapted to a task, as the suitability of the design is given by how well the robot performs in the task, which requires both a design and a controller. The co-optimization or simultaneous optimization of the design and control of robots addresses this limitation by producing a design and control that are both adapted to the task. This paper investigates some of the challenges inherent in the co-optimization of design and control in simulation. The results show that reducing how well the controllers are trained during the co-optimization process significantly improves the robot's performance when considering a second phase in which the controller for the best design is retrained with additional resources. In addition, the results demonstrate that the computation budget allocated to training the controller for each design influences design complexity, with simpler designs associated with lower training budgets. This paper experimentally studies key questions discussed in other works in the literature on the co-optimization of design and control of robots in simulation in four different co-optimization problems.
LGSep 30, 2025
A Review on Single-Problem Multi-Attempt Heuristic OptimizationJudith Echevarrieta, Etor Arza, Aritz Pérez et al.
In certain real-world optimization scenarios, practitioners are not interested in solving multiple problems but rather in finding the best solution to a single, specific problem. When the computational budget is large relative to the cost of evaluating a candidate solution, multiple heuristic alternatives can be tried to solve the same given problem, each possibly with a different algorithm, parameter configuration, initialization, or stopping criterion. The sequential selection of which alternative to try next is crucial for efficiently identifying the one that provides the best possible solution across multiple attempts. Despite the relevance of this problem in practice, it has not yet been the exclusive focus of any existing review. Several sequential alternative selection strategies have been proposed in different research topics, but they have not been comprehensively and systematically unified under a common perspective. This work presents a focused review of single-problem multi-attempt heuristic optimization. It brings together suitable strategies to this problem that have been studied separately through algorithm selection, parameter tuning, multi-start and resource allocation. These strategies are explained using a unified terminology within a common framework, which supports the development of a taxonomy for systematically organizing and classifying them.
MLOct 19, 2019
Kernels of Mallows Models under the Hamming Distance for solving the Quadratic Assignment ProblemEtor Arza, Aritz Perez, Ekhine Irurozki et al.
The Quadratic Assignment Problem (QAP) is a well-known permutation-based combinatorial optimization problem with real applications in industrial and logistics environments. Motivated by the challenge that this NP-hard problem represents, it has captured the attention of the optimization community for decades. As a result, a large number of algorithms have been proposed to tackle this problem. Among these, exact methods are only able to solve instances of size $n<40$. To overcome this limitation, many metaheuristic methods have been applied to the QAP. In this work, we follow this direction by approaching the QAP through Estimation of Distribution Algorithms (EDAs). Particularly, a non-parametric distance-based exponential probabilistic model is used. Based on the analysis of the characteristics of the QAP, and previous work in the area, we introduce Kernels of Mallows Model under the Hamming distance to the context of EDAs. Conducted experiments point out that the performance of the proposed algorithm in the QAP is superior to (i) the classical EDAs adapted to deal with the QAP, and also (ii) to the specific EDAs proposed in the literature to deal with permutation problems.