Min-Max Optimization under Delays
This addresses a gap in theory for min-max optimization under delays, which is important for applications in adversarial robustness, game theory, and reinforcement learning, but it is incremental as it extends existing algorithms to delayed settings.
The paper tackles the problem of min-max optimization under delayed gradient updates, showing that small delays can cause algorithms like Extra-gradient to diverge, but proving that Gradient Descent-Ascent and Extra-gradient with delays still converge to saddle points under certain conditions, with complexity bounds revealing a slow-down in convergence.
Delays and asynchrony are inevitable in large-scale machine-learning problems where communication plays a key role. As such, several works have extensively analyzed stochastic optimization with delayed gradients. However, as far as we are aware, no analogous theory is available for min-max optimization, a topic that has gained recent popularity due to applications in adversarial robustness, game theory, and reinforcement learning. Motivated by this gap, we examine the performance of standard min-max optimization algorithms with delayed gradient updates. First, we show (empirically) that even small delays can cause prominent algorithms like Extra-gradient (\texttt{EG}) to diverge on simple instances for which \texttt{EG} guarantees convergence in the absence of delays. Our empirical study thus suggests the need for a careful analysis of delayed versions of min-max optimization algorithms. Accordingly, under suitable technical assumptions, we prove that Gradient Descent-Ascent (\texttt{GDA}) and \texttt{EG} with delayed updates continue to guarantee convergence to saddle points for convex-concave and strongly convex-strongly concave settings. Our complexity bounds reveal, in a transparent manner, the slow-down in convergence caused by delays.