On Landscape of Lagrangian Functions and Stochastic Search for Constrained Nonconvex Optimization
This addresses optimization challenges in machine learning and signal processing, offering theoretical insights and a practical algorithm for a specific class of problems, but it is incremental as it builds on existing Lagrangian methods.
The paper tackles constrained nonconvex optimization by analyzing the landscape of Lagrangian functions, showing that stable equilibria correspond to global optima for problems like generalized eigenvalue (GEV), and proposes a stochastic primal-dual algorithm with asymptotic convergence rate and first sample complexity result for online GEV.
We study constrained nonconvex optimization problems in machine learning, signal processing, and stochastic control. It is well-known that these problems can be rewritten to a minimax problem in a Lagrangian form. However, due to the lack of convexity, their landscape is not well understood and how to find the stable equilibria of the Lagrangian function is still unknown. To bridge the gap, we study the landscape of the Lagrangian function. Further, we define a special class of Lagrangian functions. They enjoy two properties: 1.Equilibria are either stable or unstable (Formal definition in Section 2); 2.Stable equilibria correspond to the global optima of the original problem. We show that a generalized eigenvalue (GEV) problem, including canonical correlation analysis and other problems, belongs to the class. Specifically, we characterize its stable and unstable equilibria by leveraging an invariant group and symmetric property (more details in Section 3). Motivated by these neat geometric structures, we propose a simple, efficient, and stochastic primal-dual algorithm solving the online GEV problem. Theoretically, we provide sufficient conditions, based on which we establish an asymptotic convergence rate and obtain the first sample complexity result for the online GEV problem by diffusion approximations, which are widely used in applied probability and stochastic control. Numerical results are provided to support our theory.