DS LG NA OC | May 9, 2024

Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning

Michał Dereziński, Christopher Musco, Jiaming Yang

arXiv:2405.05865v25.19 citations | SODA

Originality: Highly original

AI Analysis

This work provides faster algorithms for fundamental linear algebraic problems in computational mathematics and machine learning, such as the regularized linear systems that arise in Gaussian process regression, improving on recent state-of-the-art methods.

The paper tackles solving linear systems and approximating matrix norms by introducing a new class of preconditioned iterative methods based on multi-level sketched preconditioning, resulting in faster runtimes such as solving certain linear systems in ~O(n^2.065 + k^ω) time and approximating the nuclear norm in ~O(n^2.11) time.

We present a new class of preconditioned iterative methods for solving linear systems of the form $Ax = b$. Our methods are based on constructing a low-rank Nyström approximation to $A$ using sparse random matrix sketching. This approximation is used to construct a preconditioner, which itself is inverted quickly using additional levels of random sketching and preconditioning. We prove that the convergence of our methods depends on a natural average condition number of $A$, which improves as the rank of the Nyström approximation increases. Concretely, this allows us to obtain faster runtimes for a number of fundamental linear algebraic problems:

1. We show how to solve any $n\times n$ linear system that is well-conditioned except for $k$ outlying large singular values in $\tilde{O}(n^{2.065} + k^ω)$ time, improving on a recent result of [Dereziński, Yang, STOC 2024] for all $k \gtrsim n^{0.78}$.
2. We give the first $\tilde{O}(n^2 + d_λ^ω)$ time algorithm for solving a regularized linear system $(A + λI)x = b$, where $A$ is positive semidefinite with effective dimension $d_λ=\mathrm{tr}(A(A+λI)^{-1})$. This problem arises in applications like Gaussian process regression.
3. We give faster algorithms for approximating Schatten $p$-norms and other matrix norms. For example, for the Schatten 1-norm (nuclear norm), we give an algorithm that runs in $\tilde{O}(n^{2.11})$ time, improving on an $\tilde{O}(n^{2.18})$ method of [Musco et al., ITCS 2018].

All results are proven in the real RAM model of computation. Interestingly, previous state-of-the-art algorithms for most of the problems above relied on stochastic iterative methods, like stochastic coordinate and gradient descent. Our work takes a completely different approach, instead leveraging tools from matrix sketching.
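To make the basic idea concrete, here is a minimal single-level sketch in Python with NumPy. It is not the paper's algorithm. It uses a dense Gaussian sketch instead of a sparse one, inverts the preconditioner exactly instead of through further levels of sketching, and computes $d_λ$ from a full eigendecomposition purely for illustration. The function names, the test matrix, and the rank heuristic (twice the effective dimension) are all assumptions chosen for this example.

```python
import numpy as np

def effective_dimension(A, lam):
    """d_lam = tr(A (A + lam I)^{-1}) for PSD A, via its eigenvalues (illustration only)."""
    eigs = np.clip(np.linalg.eigvalsh(A), 0.0, None)
    return float(np.sum(eigs / (eigs + lam)))

def nystrom_factors(A, rank, rng):
    """Return U, s with A ~= U diag(s) U^T, a rank-`rank` Nystrom approximation.

    Assumption: dense Gaussian sketch (the paper uses sparse sketching).
    """
    n = A.shape[0]
    S, _ = np.linalg.qr(rng.standard_normal((n, rank)))  # orthonormal test matrix
    Y = A @ S
    shift = np.sqrt(n) * np.finfo(float).eps * np.linalg.norm(Y, 2)  # stability shift
    Y_shift = Y + shift * S
    G = S.T @ Y_shift
    C = np.linalg.cholesky((G + G.T) / 2.0)   # G = C C^T
    B = np.linalg.solve(C, Y_shift.T).T       # B = Y_shift C^{-T}, so B B^T is the Nystrom matrix
    U, sigma, _ = np.linalg.svd(B, full_matrices=False)
    s = np.maximum(sigma**2 - shift, 0.0)     # undo the shift
    return U, s

def make_preconditioner(U, s, lam):
    """Apply (U diag(s) U^T + lam I)^{-1} to a vector, using the low-rank structure."""
    def apply(v):
        Utv = U.T @ v
        return U @ (Utv / (s + lam)) + (v - U @ Utv) / lam
    return apply

def pcg(matvec, b, apply_prec, tol=1e-10, max_iter=1000):
    """Standard preconditioned conjugate gradient; returns (x, iterations used)."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = apply_prec(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Hypothetical test problem: PSD matrix with polynomially decaying spectrum.
rng = np.random.default_rng(0)
n, lam = 200, 1e-3
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (Q * (1.0 / (1.0 + np.arange(n)) ** 2)) @ Q.T
b = rng.standard_normal(n)

d_lam = effective_dimension(A, lam)
rank = min(n, 2 * int(np.ceil(d_lam)))        # heuristic choice, not from the paper
U, s = nystrom_factors(A, rank, rng)
x, iters = pcg(lambda v: A @ v + lam * v, b, make_preconditioner(U, s, lam))
rel_res = np.linalg.norm(A @ x + lam * x - b) / np.linalg.norm(b)
print(f"d_lam = {d_lam:.1f}, rank = {rank}, PCG iterations = {iters}, rel. residual = {rel_res:.2e}")
```

Because the Nyström approximation is low-rank, the preconditioner is applied in $O(n \cdot \mathrm{rank})$ time per iteration. The dominant costs in this sketch are forming $AS$ and the dense factorizations, which is the part the paper speeds up with sparse sketches and additional preconditioning levels.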
