ML LG OC

Mar 29, 2023

Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints

Pierre Ablin, Simon Vary, Bin Gao, P. -A. Absil

arXiv:2303.16510v2 17.923 citations h-index: 44 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the cost of solving machine learning problems with orthogonality constraints, such as PCA and robust neural network training, by providing algorithms with cheaper iterations. The advance is incremental because it builds on prior research on the landing algorithm.

The paper tackles the computational bottleneck of enforcing orthogonality constraints during optimization. It extends the landing algorithm to the Stiefel manifold and develops stochastic and variance-reduction variants. It shows that these methods match the convergence rates of Riemannian methods that enforce the constraint exactly, and that their iterates converge to the manifold.

Orthogonality constraints naturally appear in many machine learning problems, from principal component analysis to robust neural network training. Such problems are usually solved using Riemannian optimization algorithms, which minimize the objective function while enforcing the constraint. However, enforcing the orthogonality constraint can be the most time-consuming operation in such algorithms. Recently, Ablin & Peyré (2022) proposed the landing algorithm, a method with cheap iterations that does not enforce the orthogonality constraints but is attracted towards the manifold in a smooth manner. This article provides new practical and theoretical developments for the landing algorithm. First, the method is extended to the Stiefel manifold, the set of rectangular orthogonal matrices. We also consider stochastic and variance reduction algorithms when the cost function is an average of many functions. We demonstrate that all these methods have the same rate of convergence as their Riemannian counterparts that exactly enforce the constraint, and converge to the manifold. Finally, our experiments demonstrate the promise of our approach for an array of machine-learning problems that involve orthogonality constraints.
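
The landing idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the landing direction takes the form skew(∇f(X) Xᵀ) X + λ X (Xᵀ X − I) for X of shape (n, p), with the constraint written as Xᵀ X = I. The function names, the PCA test problem, and the values of `step`, `lam`, and `n_iters` are illustrative choices and do not come from the listing above.

```python
import numpy as np


def skew(M):
    """Skew-symmetric part of a square matrix."""
    return 0.5 * (M - M.T)


def landing_field(X, euclid_grad, lam=1.0):
    """Assumed landing direction for X of shape (n, p), target X.T @ X = I_p."""
    p = X.shape[1]
    # Descent component: a skew-symmetric matrix applied to X.
    descent = skew(euclid_grad @ X.T) @ X
    # Attraction component: gradient of N(X) = 0.25 * ||X.T @ X - I_p||_F^2.
    attraction = X @ (X.T @ X - np.eye(p))
    return descent + lam * attraction


def landing_pca(A, p, step=0.1, lam=1.0, n_iters=2000, seed=0):
    """Minimize f(X) = -0.5 * trace(X.T @ A @ X) using plain landing steps."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    # Start slightly off the manifold: feasibility is never enforced.
    X = Q + 0.01 * rng.standard_normal((n, p))
    for _ in range(n_iters):
        euclid_grad = -A @ X  # Euclidean gradient of f for symmetric A
        X = X - step * landing_field(X, euclid_grad, lam)
    return X


def run_demo():
    """Build a small symmetric matrix with a known spectrum and run landing."""
    rng = np.random.default_rng(1)
    eigenvalues = np.array([1.0, 0.9, 0.8, 0.3, 0.2, 0.1, 0.05, 0.01])
    V, _ = np.linalg.qr(rng.standard_normal((8, 8)))
    A = V @ np.diag(eigenvalues) @ V.T
    A = 0.5 * (A + A.T)  # remove rounding asymmetry
    X = landing_pca(A, p=3)
    ortho_error = np.linalg.norm(X.T @ X - np.eye(3))
    captured = np.trace(X.T @ A @ X)
    return X, ortho_error, captured
```

Calling `run_demo()` returns the final iterate together with its distance to the manifold and the variance it captures. The loop uses only matrix products: no retraction, QR, or SVD is applied to the iterates, which is the source of the cheap iterations the abstract refers to.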

