Linear-Sample Learning of Low-Rank Distributions
The work addresses a fundamental gap in sample efficiency for applications such as community detection and collaborative filtering, although the algorithm itself is an incremental advance that builds on existing spectral techniques.
The paper determines the sample complexity of learning k×k, rank-r matrices in latent-variable models to normalized L1 distance ε. It shows that Ω(kr/ε²) samples are necessary and gives a polynomial-time algorithm that uses O((kr/ε²) log²(r/ε)) samples, a number linear in the dimension k and nearly linear in the rank r.
Many latent-variable applications, including community detection, collaborative filtering, genomic analysis, and NLP, model data as generated by low-rank matrices. Yet despite considerable research, except for very special cases, the number of samples required to efficiently recover the underlying matrices has not been known. We determine the onset of learning in several common latent-variable settings. For all of them, we show that learning $k\times k$, rank-$r$ matrices to normalized $L_{1}$ distance $ε$ requires $Ω(\frac{kr}{ε^2})$ samples, and propose an algorithm that uses ${\cal O}(\frac{kr}{ε^2}\log^2\frac{r}{ε})$ samples, a number linear in the high dimension and nearly linear in the typically low rank. The algorithm improves on existing spectral techniques and runs in polynomial time. The proofs establish new results on the rapid convergence of the spectral distance between the model and observation matrices, and may be of independent interest.
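To make the scaling of the two bounds concrete, the following minimal sketch evaluates $\frac{kr}{ε^2}$ and $\frac{kr}{ε^2}\log^2\frac{r}{ε}$ for given $k$, $r$, and $ε$. It rests on assumptions that are not in the paper: the function name `sample_bounds` is hypothetical, every constant hidden by the $Ω$ and ${\cal O}$ notation is set to 1, and the logarithm is taken to be natural. The returned numbers therefore illustrate growth rates only and are not actual sample counts.

```python
import math


def sample_bounds(k, r, eps):
    """Return (lower, upper) scalings of the sample complexity.

    lower ~ k*r / eps^2                    (the Omega lower bound)
    upper ~ k*r / eps^2 * log^2(r / eps)   (the algorithm's upper bound)

    Assumption: all hidden constants are 1 and the log is natural, so the
    values show how the bounds grow, not how many samples are needed.
    """
    if not (1 <= r <= k):
        raise ValueError("need 1 <= r <= k")
    if not (0 < eps < 1):
        raise ValueError("need 0 < eps < 1")
    base = k * r / eps ** 2
    lower = base
    upper = base * math.log(r / eps) ** 2
    return lower, upper


if __name__ == "__main__":
    for k in (1_000, 2_000, 4_000):
        lower, upper = sample_bounds(k, r=5, eps=0.1)
        print(f"k={k:>5}  lower~{lower:.3g}  upper~{upper:.3g}")
```

Under these assumptions, doubling $k$ doubles both values, which is the sense in which the sample count is linear in the dimension, and the ratio between the two values is exactly $\log^2\frac{r}{ε}$, independent of $k$.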