LGDSOCSTAug 7, 2023

Matrix Completion in Almost-Verification Time

arXiv:2308.03661v16 citationsh-index: 30
Originality Highly original
AI Analysis

This work addresses the fundamental problem of efficiently recovering low-rank matrices from random observations, with significant improvements in computational and sample efficiency for applications in data analysis and machine learning, though it is incremental in building upon prior methods.

The paper tackles low-rank matrix completion by introducing a new framework that achieves high-precision completion with improved sample and time complexities, reducing from prior state-of-the-art of ≈mr⁵ samples and ≈mr⁷ time to mr²⁺ᵒ⁽¹⁾ samples and mr³⁺ᵒ⁽¹⁾ time under incoherent spans, and further to almost optimal mr¹⁺ᵒ⁽¹⁾ samples and mr²⁺ᵒ⁽¹⁾ time under a new regularity assumption.

We give a new framework for solving the fundamental problem of low-rank matrix completion, i.e., approximating a rank-$r$ matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$ (where $m \ge n$) from random observations. First, we provide an algorithm which completes $\mathbf{M}$ on $99\%$ of rows and columns under no further assumptions on $\mathbf{M}$ from $\approx mr$ samples and using $\approx mr^2$ time. Then, assuming the row and column spans of $\mathbf{M}$ satisfy additional regularity properties, we show how to boost this partial completion guarantee to a full matrix completion algorithm by aggregating solutions to regression problems involving the observations. In the well-studied setting where $\mathbf{M}$ has incoherent row and column spans, our algorithms complete $\mathbf{M}$ to high precision from $mr^{2+o(1)}$ observations in $mr^{3 + o(1)}$ time (omitting logarithmic factors in problem parameters), improving upon the prior state-of-the-art [JN15] which used $\approx mr^5$ samples and $\approx mr^7$ time. Under an assumption on the row and column spans of $\mathbf{M}$ we introduce (which is satisfied by random subspaces with high probability), our sample complexity improves to an almost information-theoretically optimal $mr^{1 + o(1)}$, and our runtime improves to $mr^{2 + o(1)}$. Our runtimes have the appealing property of matching the best known runtime to verify that a rank-$r$ decomposition $\mathbf{U}\mathbf{V}^\top$ agrees with the sampled observations. We also provide robust variants of our algorithms that, given random observations from $\mathbf{M} + \mathbf{N}$ with $\|\mathbf{N}\|_{F} \le Δ$, complete $\mathbf{M}$ to Frobenius norm distance $\approx r^{1.5}Δ$ in the same runtimes as the noiseless setting. Prior noisy matrix completion algorithms [CP10] only guaranteed a distance of $\approx \sqrt{n}Δ$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes