Practical and Fast Momentum-Based Power Methods
This work addresses a practical bottleneck in machine learning tasks like PCA and clustering by providing faster, more robust algorithms, though it is incremental relative to existing momentum-based methods.
The authors tackled the problem of accelerating the power method for eigenvalue computation without requiring prior spectral knowledge, resulting in two new algorithms (DMPower and DMStream) that achieve near-optimal convergence rates and outperform the vanilla power method in experiments.
The power method is a classical algorithm with broad applications in machine learning tasks, including streaming PCA, spectral clustering, and low-rank matrix approximation. The distilled purpose of the vanilla power method is to determine the largest eigenvalue (in absolute modulus) and its eigenvector of a matrix. A momentum-based scheme can be used to accelerate the power method, but achieving an optimal convergence rate with existing algorithms critically relies on additional spectral information that is unavailable at run-time, and sub-optimal initializations can result in divergence. In this paper, we provide a pair of novel momentum-based power methods, which we call the delayed momentum power method (DMPower) and a streaming variant, the delayed momentum streaming method (DMStream). Our methods leverage inexact deflation and are capable of achieving near-optimal convergence with far less restrictive hyperparameter requirements. We provide convergence analyses for both algorithms through the lens of perturbation theory. Further, we experimentally demonstrate that DMPower routinely outperforms the vanilla power method and that both algorithms match the convergence speed of an oracle running existing accelerated methods with perfect spectral knowledge.