Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies
This work addresses gradient estimation in machine learning for settings where automatic differentiation fails, offering a more parallelizable and efficient alternative.
The paper tackles the challenge of gradient estimation in unrolled computation graphs with sensitive or blackbox loss functions by proposing Noise-Reuse Evolution Strategies (NRES), which achieves faster convergence than existing methods in applications like learning dynamical systems and reinforcement learning.
Unrolled computation graphs are prevalent throughout machine learning but present challenges to automatic differentiation (AD) gradient estimation methods when their loss functions exhibit extreme local sensitivity, discontinuity, or blackbox characteristics. In such scenarios, online evolution strategies methods are a more capable alternative, while being more parallelizable than vanilla evolution strategies (ES) by interleaving partial unrolls and gradient updates. In this work, we propose a general class of unbiased online evolution strategies methods. We analytically and empirically characterize the variance of this class of gradient estimators and identify the one with the least variance, which we term Noise-Reuse Evolution Strategies (NRES). Experimentally, we show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of unroll steps across a variety of applications, including learning dynamical systems, meta-training learned optimizers, and reinforcement learning.
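To make the idea of interleaving partial unrolls with gradient estimates more concrete, the following is a minimal NumPy sketch of one possible reading of noise reuse. It is not the paper's reference implementation. The abstract does not spell out the estimator, so the antithetic pair, the Gaussian perturbation with standard deviation `sigma`, and the `1 / (2 * sigma**2)` scaling are assumptions. The toy dynamics `step`, the per-step `loss`, and the helper names are hypothetical and chosen only for illustration.

```python
import numpy as np


def step(state, theta):
    """Hypothetical one-step dynamics: a damped update driven by theta."""
    return 0.9 * state + theta


def loss(state):
    """Hypothetical per-step loss: squared norm of the state."""
    return float(np.sum(state ** 2))


def unroll_window(state, theta, num_steps):
    """Unroll num_steps from state; return the final state and the summed loss."""
    total = 0.0
    for _ in range(num_steps):
        state = step(state, theta)
        total += loss(state)
    return state, total


def noise_reuse_es_gradients(theta, init_state, horizon, window, sigma, rng):
    """Return one antithetic ES gradient estimate per truncation window.

    Assumption: a single perturbation eps is sampled at the start of the
    episode and reused for every window, and each perturbed rollout carries
    its own state across windows.
    """
    eps = sigma * rng.standard_normal(theta.shape)  # sampled once per episode
    pos_state = init_state.copy()  # state of the rollout under theta + eps
    neg_state = init_state.copy()  # state of the rollout under theta - eps
    grads = []
    for start in range(0, horizon, window):
        n = min(window, horizon - start)  # the last window may be shorter
        pos_state, pos_loss = unroll_window(pos_state, theta + eps, n)
        neg_state, neg_loss = unroll_window(neg_state, theta - eps, n)
        # Antithetic finite-difference estimate for this window only.
        grads.append((pos_loss - neg_loss) / (2.0 * sigma ** 2) * eps)
    return grads


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([0.5, -0.3])
    grads = noise_reuse_es_gradients(
        theta, np.zeros(2), horizon=10, window=3, sigma=0.1, rng=rng
    )
    # A parameter update could be applied after each window's estimate.
    for i, g in enumerate(grads):
        print(f"window {i}: gradient estimate {g}")
```

Under these assumptions, the per-window estimates of one episode sum to the full-horizon antithetic ES estimate computed with the same perturbation, because the perturbed states are carried from one window to the next. In practice, many such workers would run in parallel and their estimates would be averaged.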