OC CC LGOct 23, 2022

Explicit Second-Order Min-Max Optimization: Practical Algorithms and Complexity Analysis

Tianyi Lin, Panayotis Mertikopoulos, Michael I. Jordan

arXiv:2210.12860v65.62 citationsh-index: 163

Originality Incremental advance

AI Analysis

This work addresses a gap in second-order methods for min-max optimization, offering practical algorithms with improved complexity for researchers and practitioners in optimization and machine learning, though it is incremental as it builds on existing extra-gradient methods.

The paper tackles the problem of finding global saddle points in convex-concave min-max optimization by proposing inexact regularized Newton-type methods, achieving convergence to an ε-saddle point within O(ε^{-2/3}) iterations and reducing computational overhead by shaving off an O(log log(1/ε)) factor in Schur decompositions compared to existing methods.

We propose and analyze several inexact regularized Newton-type methods for finding a global saddle point of \emph{convex-concave} unconstrained min-max optimization problems. Compared to first-order methods, our understanding of second-order methods for min-max optimization is relatively limited, as obtaining global rates of convergence with second-order information can be much more involved. In this paper, we examine how second-order information is used to speed up extra-gradient methods, even under inexactness. In particular, we show that the proposed methods generate iterates that remain within a bounded set and that the averaged iterates converge to an $ε$-saddle point within $O(ε^{-2/3})$ iterations in terms of a restricted gap function. We also provide a simple routine for solving the subproblem at each iteration, requiring a single Schur decomposition and $O(\log\log(1/ε))$ calls to a linear system solver in a quasi-upper-triangular system. Thus, our method improves the existing line-search-based second-order min-max optimization methods by shaving off an $O(\log\log(1/ε))$ factor in the required number of Schur decompositions. Finally, we conduct experiments on synthetic and real data to demonstrate the efficiency of the proposed methods.

View on arXiv PDF

Similar