MLITLGOCPRSTAug 20, 2024

Non-convex matrix sensing: Breaking the quadratic rank barrier in the sample complexity

arXiv:2408.13276v47 citationsh-index: 2
Originality Highly original
AI Analysis

This addresses a key bottleneck in matrix sensing for applications like recommendation systems or imaging, providing a computationally efficient method with theoretical guarantees, though it is incremental in bridging a known gap between convex and non-convex approaches.

The paper tackles the problem of reconstructing a low-rank matrix from linear measurements, showing that non-convex factorized gradient descent achieves sample complexity scaling linearly with rank, matching convex methods and improving from previous quadratic scaling.

For the problem of reconstructing a low-rank matrix from a few linear measurements, two classes of algorithms have been widely studied in the literature: convex approaches based on nuclear norm minimization, and non-convex approaches that use factorized gradient descent. Under certain statistical model assumptions, it is known that nuclear norm minimization recovers the ground truth as soon as the number of samples scales linearly with the number of degrees of freedom of the ground-truth. In contrast, while non-convex approaches are computationally less expensive, existing recovery guarantees assume that the number of samples scales at least quadratically with the rank $r$ of the ground-truth matrix. In this paper, we close this gap by showing that the non-convex approaches can be as efficient as nuclear norm minimization in terms of sample complexity. Namely, we consider the problem of reconstructing a positive semidefinite matrix from a few Gaussian measurements. We show that factorized gradient descent with spectral initialization converges to the ground truth at a linear rate as soon as the number of samples scales with $ Ω(rdκ^2)$, where $d$ is the dimension, and $κ$ is the condition number of the ground truth matrix. This improves the previous rank-dependence in the sample complexity of non-convex matrix factorization from quadratic to linear. Furthermore, we extend our theory to the noisy setting, where we show that with noisy measurements, factorized gradient descent with spectral initialization converges to the minimax optimal error up to a factor linear in $κ$. Our proof relies on a probabilistic decoupling argument, where we show that the gradient descent iterates are only weakly dependent on the individual entries of the measurement matrices. We expect that our proof technique is of independent interest for other non-convex problems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes