GT · LG · May 13

When and Why is Optimistic Multiplicative Weights Slow? The Geometry of Energy Dissipation

John Lazarsfeld, Anas Barakat, Georgios Piliouras, Antonios Varvitsiotis, Andre Wibisono

arXiv:2605.13242


AI Analysis

For researchers in game theory and online learning, this work provides a sharp geometric understanding of OMWU's convergence behavior, resolving open questions about slow convergence and establishing optimal rates.

The paper studies the Optimistic Multiplicative Weights Update (OMWU) algorithm in two-player zero-sum games, developing a new analysis framework that quantifies when and why slow convergence occurs. It proves a new linear last-iterate convergence rate in KL divergence for games with a unique interior Nash equilibrium, with optimal dependence on game constants, and establishes new separations in uniform convergence rates across different distance measures.

This paper studies the convergence of the Optimistic Multiplicative Weights Update algorithm (OMWU) in two-player zero-sum games. Recent works have identified instances on which the last iterate of OMWU can converge arbitrarily slowly, but understanding when and why this slow convergence occurs has remained open. In this work, we develop a new analysis framework that gives sharp, quantitative explanations for this behavior. Our analysis is based on viewing the algorithm's dual iterates as an optimistic skew-gradient descent with respect to an energy function. We prove that this energy is dissipative over the dual iterates, and by establishing tight bounds on the magnitude of dissipation, our analysis quantifies the geometric bottlenecks that arise when the corresponding primal iterates are close to the simplex boundary. This translates into a new linear last-iterate convergence rate in KL divergence on games with a unique and interior Nash equilibrium. Compared to prior work, this new rate has a much sharper dependence on game-specific constants, and we prove this dependence is optimal. Moreover, these geometric insights yield new separations on uniform convergence rates for OMWU. On the one hand, we prove constant lower bounds on the uniform best-iterate convergence rate in KL divergence and in total variation distance from Nash. On the other hand, we establish for the $2\times 2$ setting a new ${\widetilde O}(T^{-1/2})$ best-iterate rate in duality gap, improving substantially over prior work. Together, these results show that, in general, uniform convergence rate guarantees do not transfer across different measures of distance to Nash.
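To make the setting concrete, here is a minimal Python sketch of OMWU on a small zero-sum game with a unique interior Nash equilibrium. It is an illustration under stated assumptions, not the paper's code: the update form (a multiplicative step using twice the current payoff vector minus the previous one), the matching-pennies matrix `A`, the step size `eta`, the starting points, and the choice of KL(Nash ‖ iterate) as the tracked quantity are all assumptions made for this example.

```python
import math

def payoffs(A, x, y):
    """Return (A y, A^T x): the payoff vectors seen by the row and column player."""
    n, m = len(A), len(A[0])
    gx = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
    gy = [sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
    return gx, gy

def omwu(A, x0, y0, eta, steps):
    """Run OMWU for min_x max_y x^T A y and return the list of iterates (x_t, y_t)."""
    x, y = list(x0), list(y0)
    # Assumption: the "previous" payoffs start equal to the current ones,
    # so the very first step reduces to a plain multiplicative-weights step.
    gx_prev, gy_prev = payoffs(A, x, y)
    history = [(x[:], y[:])]
    for _ in range(steps):
        gx, gy = payoffs(A, x, y)
        # Optimistic step: use 2 * (current payoff) - (previous payoff).
        wx = [xi * math.exp(-eta * (2 * g - gp)) for xi, g, gp in zip(x, gx, gx_prev)]
        wy = [yj * math.exp(eta * (2 * g - gp)) for yj, g, gp in zip(y, gy, gy_prev)]
        sx, sy = sum(wx), sum(wy)
        x = [w / sx for w in wx]  # renormalize onto the simplex
        y = [w / sy for w in wy]
        gx_prev, gy_prev = gx, gy
        history.append((x[:], y[:]))
    return history

def kl(p, q):
    """KL divergence KL(p || q) between two distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Matching pennies: the unique Nash equilibrium is uniform for both players.
A = [[1.0, -1.0], [-1.0, 1.0]]
nash_x, nash_y = [0.5, 0.5], [0.5, 0.5]

history = omwu(A, x0=[0.8, 0.2], y0=[0.3, 0.7], eta=0.1, steps=3000)
kl_hist = [kl(nash_x, x) + kl(nash_y, y) for x, y in history]

print("initial KL to Nash:", kl_hist[0])
print("final KL to Nash:  ", kl_hist[-1])
```

Starting the same run closer to the simplex boundary (for example `x0=[0.999, 0.001]`) is one way to observe informally how long the iterates take to leave the boundary region before the KL divergence starts to shrink.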
