Symmetry Teleportation for Accelerated Optimization
This work addresses slow convergence in gradient-based optimization, a practical concern for machine learning practitioners, with an approach that can be applied incrementally on top of existing optimizers.
The paper tackles the slow convergence of gradient-based optimization by introducing symmetry teleportation, which uses symmetries of the loss to move parameters along a loss level set so that subsequent steps converge faster. Experiments on test functions, multi-layer regressions, and MNIST classification show improved convergence speed.
Existing gradient-based optimization methods update parameters locally, in a direction that minimizes the loss function. We study a different approach, symmetry teleportation, that allows parameters to travel a large distance on the loss level set, in order to improve the convergence speed in subsequent steps. Teleportation exploits symmetries in the loss landscape of optimization problems. We derive loss-invariant group actions for test functions in optimization and multi-layer neural networks, and prove a necessary condition for teleportation to improve convergence rate. We also show that our algorithm is closely related to second-order methods. Experimentally, we show that teleportation improves the convergence speed of gradient descent and AdaGrad for several optimization problems including test functions, multi-layer regressions, and MNIST classification.
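To make the idea of moving along a loss level set concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a two-layer linear regression with loss ||Y - W2 W1 X||^2, for which the map (W1, W2) -> (g W1, W2 g^{-1}) leaves the loss unchanged for any invertible g. The function names (`act`, `teleport`), the random candidate set, and the choice to pick the candidate with the largest gradient norm are all assumptions made for illustration; a crude search over a few candidates stands in for whatever optimization over the group one would actually use.

```python
import numpy as np

def loss(W1, W2, X, Y):
    # Squared error of a two-layer linear model (illustrative choice).
    return float(np.sum((Y - W2 @ W1 @ X) ** 2))

def grad_norm_sq(W1, W2, X, Y):
    # Squared norm of the gradient of the loss with respect to (W1, W2).
    R = W2 @ W1 @ X - Y                 # residual
    g_W1 = 2.0 * W2.T @ R @ X.T         # dL/dW1
    g_W2 = 2.0 * R @ (W1 @ X).T         # dL/dW2
    return float(np.sum(g_W1 ** 2) + np.sum(g_W2 ** 2))

def act(g, W1, W2):
    # Loss-invariant action: W2 g^{-1} g W1 = W2 W1.
    return g @ W1, W2 @ np.linalg.inv(g)

def teleport(W1, W2, X, Y, candidates):
    # Hypothetical helper: keep the candidate that maximizes the
    # gradient norm; fall back to the current point if none improves it.
    best = (W1, W2)
    best_norm = grad_norm_sq(W1, W2, X, Y)
    for g in candidates:
        V1, V2 = act(g, W1, W2)
        n = grad_norm_sq(V1, V2, X, Y)
        if n > best_norm:
            best, best_norm = (V1, V2), n
    return best

rng = np.random.default_rng(0)
d, h, k, n = 4, 3, 2, 20                # input, hidden, output dims; samples
X = rng.normal(size=(d, n))
Y = rng.normal(size=(k, n))
W1 = rng.normal(size=(h, d))
W2 = rng.normal(size=(k, h))

# Candidate group elements near the identity (assumed to be invertible).
candidates = [np.eye(h) + 0.3 * rng.normal(size=(h, h)) for _ in range(50)]

loss_before = loss(W1, W2, X, Y)
norm_before = grad_norm_sq(W1, W2, X, Y)
W1_new, W2_new = teleport(W1, W2, X, Y, candidates)
loss_after = loss(W1_new, W2_new, X, Y)
norm_after = grad_norm_sq(W1_new, W2_new, X, Y)

print("loss before / after teleport:", loss_before, loss_after)
print("grad norm^2 before / after:  ", norm_before, norm_after)
```

The loss is the same before and after up to floating-point error, while the gradient norm at the teleported point is at least as large as at the starting point, so an ordinary gradient descent or AdaGrad step taken from there starts from a steeper point on the same level set.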