ITLGDec 6, 2018

Rank-one matrix estimation: analysis of algorithmic and information theoretic limits by the spatial coupling method

arXiv:1812.02537v124 citations
Originality Incremental advance
AI Analysis

This work addresses fundamental estimation problems in machine learning and statistics, offering a generic proof technique that could apply to many open issues, though it is incremental in extending spatial coupling methods to matrix estimation.

The paper tackles the problem of factorizing low-rank matrices, such as in sparse PCA and community detection, by rigorously deriving the mutual information for symmetric rank-one cases using spatial coupling, revealing a computational gap between known polynomial algorithms and information-theoretic limits that vanishes in the coupled model.

Factorizing low-rank matrices is a problem with many applications in machine learning and statistics, ranging from sparse PCA to community detection and sub-matrix localization. For probabilistic models in the Bayes optimal setting, general expressions for the mutual information have been proposed using powerful heuristic statistical physics computations via the replica and cavity methods, and proven in few specific cases by a variety of methods. Here, we use the spatial coupling methodology developed in the framework of error correcting codes, to rigorously derive the mutual information for the symmetric rank-one case. We characterize the detectability phase transitions in a large set of estimation problems, where we show that there exists a gap between what currently known polynomial algorithms (in particular spectral methods and approximate message-passing) can do and what is expected information theoretically. Moreover, we show that the computational gap vanishes for the proposed spatially coupled model, a promising feature with many possible applications. Our proof technique has an interest on its own and exploits three essential ingredients: the interpolation method first introduced in statistical physics, the analysis of approximate message-passing algorithms first introduced in compressive sensing, and the theory of threshold saturation for spatially coupled systems first developed in coding theory. Our approach is very generic and can be applied to many other open problems in statistical estimation where heuristic statistical physics predictions are available.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes