An Embarrassingly Simple Way to Optimize Orthogonal Matrices at Scale

arXiv:2602.14656v12.71 citations | h-index: 5 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck for researchers and practitioners in machine learning who need efficient optimization with orthogonality constraints, representing an incremental improvement over prior methods like Landing.

The paper tackles the problem of optimizing orthogonal matrices at scale, which is computationally expensive with existing methods. It introduces POGO, a fast, GPU-friendly algorithm that greatly outperforms recent optimizers and can optimize problems with thousands of orthogonal matrices in minutes where alternatives would take hours.

Orthogonality constraints are ubiquitous in robust and probabilistic machine learning. Unfortunately, current optimizers are computationally expensive and do not scale to problems with hundreds or thousands of constraints. One notable exception is the Landing algorithm (Ablin et al., 2024) which, however, comes at the expense of temporarily relaxing orthogonality. In this work, we revisit and improve on the ideas behind Landing, enabling the inclusion of modern adaptive optimizers while ensuring that orthogonal constraints are effectively met. Remarkably, these improvements come at little to no cost, and reduce the number of required hyperparameters. Our algorithm POGO is fast and GPU-friendly, consisting of only 5 matrix products, and in practice maintains orthogonality at all times. On several challenging benchmarks, POGO greatly outperforms recent optimizers and shows it can optimize problems with thousands of orthogonal matrices in minutes while alternatives would take hours. As such, POGO sets a milestone to finally exploit orthogonality constraints in ML at scale. A PyTorch implementation of POGO is publicly available at https://github.com/adrianjav/pogo.
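
To make the trade-off in the abstract concrete, the following is a minimal NumPy sketch of a Landing-style update, assuming the commonly stated form of that method: a skew-symmetric "relative gradient" term plus a penalty term that pulls the iterate back toward the orthogonal matrices. It is not POGO and not the code from the repository above. The function names, the toy objective, and the values of `lr` and `lam` are hypothetical choices made only for illustration.

```python
import numpy as np

def orthogonality_error(X):
    """Frobenius norm of X X^T - I; zero when the rows of X are orthonormal."""
    return np.linalg.norm(X @ X.T - np.eye(X.shape[0]))

def landing_style_step(X, grad, lr=0.1, lam=1.0):
    """One retraction-free, Landing-style update (illustrative sketch, not POGO).

    The step combines two matrix-product-only terms:
      * skew(grad X^T) X  moves along the orthogonal group to reduce the loss,
      * lam (X X^T - I) X pulls X back toward orthogonality.
    Between steps X is only approximately orthogonal.
    """
    A = grad @ X.T
    skew = 0.5 * (A - A.T)                    # skew-symmetric part of grad X^T
    residual = X @ X.T - np.eye(X.shape[0])   # distance-to-orthogonality term
    return X - lr * (skew @ X + lam * residual @ X)

# Toy problem (an assumption for this example): move an orthogonal X
# toward a fixed orthogonal target under the loss 0.5 * ||X - target||_F^2.
rng = np.random.default_rng(0)
n = 4
X, _ = np.linalg.qr(rng.standard_normal((n, n)))
target, _ = np.linalg.qr(rng.standard_normal((n, n)))

def loss(X):
    return 0.5 * np.linalg.norm(X - target) ** 2

initial_loss = loss(X)
for _ in range(200):
    grad = X - target                         # Euclidean gradient of the loss
    X = landing_style_step(X, grad)

final_loss = loss(X)
final_error = orthogonality_error(X)
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
print(f"orthogonality error after 200 steps: {final_error:.2e}")
```

Each step here uses only matrix products, which is what makes this family of methods GPU-friendly. The cost is that orthogonality holds only approximately between steps, which is the relaxation the abstract attributes to Landing and says POGO improves on.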
