SDEs for Minimax Optimization
This work aims to help researchers and practitioners in fields such as machine learning and economics understand and improve minimax optimization. The contribution is incremental, however, because it applies an existing mathematical framework (SDEs) to a known bottleneck in optimization analysis.
The paper tackles the challenge of analyzing minimax optimization dynamics in stochastic scenarios by pioneering the use of stochastic differential equations (SDEs) to model and compare optimizers such as Stochastic Gradient Descent-Ascent, Stochastic Extragradient, and Stochastic Hamiltonian Gradient Descent. The outcomes are SDE models that provably approximate these algorithms, a unified analysis strategy, and convergence conditions and closed-form solutions for simplified settings.
Minimax optimization problems have attracted considerable attention over the past few years, with applications ranging from economics to machine learning. While advanced optimization methods exist for such problems, characterizing their dynamics in stochastic scenarios remains notably challenging. In this paper, we pioneer the use of stochastic differential equations (SDEs) to analyze and compare minimax optimizers. Our SDE models for Stochastic Gradient Descent-Ascent, Stochastic Extragradient, and Stochastic Hamiltonian Gradient Descent are provable approximations of their algorithmic counterparts, clearly showcasing the interplay between hyperparameters, implicit regularization, and implicit curvature-induced noise. This perspective also allows for a unified and simplified analysis strategy based on the principles of Itô calculus. Finally, our approach facilitates the derivation of convergence conditions and closed-form solutions for the dynamics in simplified settings, unveiling further insights into the behavior of different optimizers.
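To make the idea of an SDE model of an optimizer concrete, the following is a minimal sketch and not the paper's construction. It assumes a toy strongly-convex-strongly-concave game f(x, y) = (mu/2) x^2 + x y - (mu/2) y^2, additive Gaussian gradient noise of scale sigma, and a generic first-order SDE dZ = -A Z dt + sqrt(eta) sigma dW for Stochastic Gradient Descent-Ascent with step size eta. The SDE models described above may contain further terms (for example, those tied to implicit regularization or curvature-induced noise) that this sketch omits. All names (`simulate`, `mu`, `sigma`, `substeps`) are hypothetical. Under these assumptions, Itô's formula applied to ||Z||^2 gives the closed form E||Z_t||^2 = exp(-2 mu t) ||z0||^2 + (eta sigma^2 / mu) (1 - exp(-2 mu t)), which the code compares against both the discrete algorithm and an Euler-Maruyama simulation of the SDE.

```python
import numpy as np


def simulate(eta=0.01, mu=1.0, sigma=1.0, T=5.0,
             n_paths=2000, substeps=10, seed=0):
    """Compare SGDA, an assumed first-order SDE model, and a closed form.

    Toy game (an assumption for illustration):
        f(x, y) = (mu/2) x^2 + x*y - (mu/2) y^2
    Descent in x and ascent in y follow the vector field F(z) = A z.
    """
    rng = np.random.default_rng(seed)
    A = np.array([[mu, 1.0], [-1.0, mu]])
    z0 = np.array([1.0, 1.0])
    n_steps = int(round(T / eta))  # SGDA iteration k corresponds to time k * eta

    # Discrete SGDA with additive Gaussian gradient noise (assumed noise model).
    z = np.tile(z0, (n_paths, 1))
    for _ in range(n_steps):
        noise = sigma * rng.standard_normal(z.shape)
        z = z - eta * (z @ A.T + noise)
    sgda_msq = np.mean(np.sum(z**2, axis=1))

    # Euler-Maruyama discretization of dZ = -A Z dt + sqrt(eta) * sigma dW,
    # using a time step finer than eta so it is not identical to SGDA itself.
    dt = eta / substeps
    z = np.tile(z0, (n_paths, 1))
    for _ in range(n_steps * substeps):
        dw = np.sqrt(dt) * rng.standard_normal(z.shape)
        z = z - dt * (z @ A.T) + np.sqrt(eta) * sigma * dw
    sde_msq = np.mean(np.sum(z**2, axis=1))

    # Closed form for E||Z_T||^2 under the assumed SDE (via Ito's formula).
    decay = np.exp(-2.0 * mu * T)
    closed_msq = decay * (z0 @ z0) + (eta * sigma**2 / mu) * (1.0 - decay)

    return sgda_msq, sde_msq, closed_msq


if __name__ == "__main__":
    sgda_msq, sde_msq, closed_msq = simulate()
    print(f"SGDA        E||z||^2 ~ {sgda_msq:.5f}")
    print(f"SDE (E-M)   E||z||^2 ~ {sde_msq:.5f}")
    print(f"Closed form E||Z||^2 = {closed_msq:.5f}")
```

In this toy setting the three quantities should agree up to Monte Carlo error and a discretization gap that shrinks with eta. The closed form also shows how the step size and noise scale set the size of the stationary neighborhood (eta sigma^2 / mu), which is the kind of hyperparameter interplay an SDE view is meant to expose.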