Local Linear Convergence of Infeasible Optimization with Orthogonal Constraints
This work addresses a computational bottleneck in machine learning algorithms like PCA and deep neural network training by providing a more efficient optimization method, though it is incremental as it builds on the existing landing algorithm.
The paper tackles the problem of solving optimization tasks under orthogonality constraints, which are common in machine learning, by analyzing the landing algorithm and establishing a novel linear convergence rate for smooth non-convex functions, with numerical experiments showing it performs on par with state-of-the-art methods while reducing computational overhead.
Many classical and modern machine learning algorithms require solving optimization tasks under orthogonality constraints. Solving these tasks with feasible methods requires a gradient descent update followed by a retraction operation on the Stiefel manifold, which can be computationally expensive. Recently, an infeasible retraction-free approach, termed the landing algorithm, was proposed as an efficient alternative. Motivated by the common occurrence of orthogonality constraints in tasks such as principle component analysis and training of deep neural networks, this paper studies the landing algorithm and establishes a novel linear convergence rate for smooth non-convex functions using only a local Riemannian PŁ condition. Numerical experiments demonstrate that the landing algorithm performs on par with the state-of-the-art retraction-based methods with substantially reduced computational overhead.