Non-local Optimization: Imposing Structure on Optimization Problems by Relaxation
This work provides theoretical insights for researchers in evolutionary computation and reinforcement learning, though it appears incremental as it builds on existing relaxation methods.
The paper tackles the problem of optimizing non-differentiable or non-convex functions in stochastic optimization by analyzing the structure of relaxations, showing that properties like consistency of optimal values, Lipschitz gradients, and convexity enable fast and reliable optimization.
In stochastic optimization, particularly in evolutionary computation and reinforcement learning, the optimization of a function $f: Ω\to \mathbb{R}$ is often addressed by optimizing a so-called relaxation $θ\in Θ\mapsto \mathbb{E}_θ(f)$ of $f$, where $Θ$ is the parameter set of a family of probability measures on $Ω$. We investigate the structure of such relaxations by means of measure theory and Fourier analysis, which sheds light on the success of many associated stochastic optimization methods. The main structural traits we derive, and which allow fast and reliable optimization of relaxations, are the consistency of optimal values of $f$, Lipschitz continuity of gradients, and convexity. We emphasize settings where $f$ itself is not differentiable or convex, e.g., in the presence of (stochastic) disturbance.
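To make the notion of a relaxation concrete, the following is a minimal sketch and not the paper's method. It assumes a Gaussian family $\mathcal{N}(θ, σ^2)$ on $Ω = \mathbb{R}$ and a hypothetical piecewise-constant objective $f(x) = \lfloor |x| \rfloor$, whose gradient is zero almost everywhere. The gradient of the relaxation $θ \mapsto \mathbb{E}_θ(f)$ is estimated with the standard score-function identity $\nabla_θ \mathbb{E}_θ(f) = \mathbb{E}_θ\big(f(x)\,(x - θ)/σ^2\big)$. The names `f`, `relaxation_grad`, and `minimize_relaxation`, as well as the step size and sample counts, are illustrative choices.

```python
import numpy as np


def f(x):
    # Hypothetical objective: piecewise constant, so its own gradient is
    # zero almost everywhere and gives gradient descent nothing to follow.
    return np.floor(np.abs(x))


def relaxation_grad(theta, rng, sigma=1.0, n_samples=10_000):
    # Monte Carlo estimate of d/dtheta E_{x ~ N(theta, sigma^2)}[f(x)]
    # via the score-function identity; only evaluations of f are needed.
    x = rng.normal(loc=theta, scale=sigma, size=n_samples)
    fx = f(x)
    # Subtracting the sample mean reduces variance (at the cost of an
    # O(1/n_samples) bias).
    baseline = fx.mean()
    return float(np.mean((fx - baseline) * (x - theta)) / sigma**2)


def minimize_relaxation(theta0, steps=200, lr=0.1, seed=0):
    # Plain stochastic gradient descent on the relaxation, not on f itself.
    rng = np.random.default_rng(seed)
    theta = float(theta0)
    for _ in range(steps):
        theta -= lr * relaxation_grad(theta, rng)
    return theta


if __name__ == "__main__":
    print(minimize_relaxation(3.0))
```

In this toy setting the relaxation is smooth in $θ$ even though $f$ is discontinuous, and its minimizer $θ = 0$ lies where $f$ attains its minimal value, so descent on the relaxation moves toward a minimizer of $f$.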