OCCCCRLGOct 12, 2024

Second-Order Min-Max Optimization with Lazy Hessians

arXiv:2410.09568v28 citationsh-index: 6ICLR
AI Analysis

It addresses computational bottlenecks in optimization for researchers and practitioners, but is incremental as it builds on existing methods with a specific improvement.

This paper tackles the computational complexity of second-order methods for convex-concave minimax optimization by reusing Hessians across iterations, reducing the complexity to ω((N+d^2)(d+d^{2/3}ε^{-2/3})) and improving upon previous methods by a factor of d^{1/3}.

This paper studies second-order methods for convex-concave minimax optimization. Monteiro and Svaiter (2012) proposed a method to solve the problem with an optimal iteration complexity of $\mathcal{O}(ε^{-3/2})$ to find an $ε$-saddle point. However, it is unclear whether the computational complexity, $\mathcal{O}((N+ d^2) d ε^{-2/3})$, can be improved. In the above, we follow Doikov et al. (2023) and assume the complexity of obtaining a first-order oracle as $N$ and the complexity of obtaining a second-order oracle as $dN$. In this paper, we show that the computation cost can be reduced by reusing Hessian across iterations. Our methods take the overall computational complexity of $ \tilde{\mathcal{O}}( (N+d^2)(d+ d^{2/3}ε^{-2/3}))$, which improves those of previous methods by a factor of $d^{1/3}$. Furthermore, we generalize our method to strongly-convex-strongly-concave minimax problems and establish the complexity of $\tilde{\mathcal{O}}((N+d^2) (d + d^{2/3} κ^{2/3}) )$ when the condition number of the problem is $κ$, enjoying a similar speedup upon the state-of-the-art method. Numerical experiments on both real and synthetic datasets also verify the efficiency of our method.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes