ST, ML | Feb 6, 2015

Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix

arXiv:1502.01988v2 | 65 citations
Originality: Incremental advance
AI Analysis

This paper addresses a fundamental problem in high-dimensional inference that is relevant to researchers in statistics and machine learning, and it provides theoretical insight into computational-statistical trade-offs.

The paper studies submatrix localization in noisy matrices. It establishes a computational threshold and a statistical threshold for the signal-to-noise ratio and shows that the computational threshold is significantly higher than the statistical one, which indicates a gap between statistical optimality and computational efficiency.

The interplay between computational efficiency and statistical accuracy in high-dimensional inference has drawn increasing attention in the literature. In this paper, we study computational and statistical boundaries for submatrix localization. Given a single observation of one or multiple non-overlapping signal submatrices (each of magnitude $\lambda$ and size $k_m \times k_n$) contaminated with a noise matrix (of size $m \times n$), we establish two transition thresholds for the signal-to-noise ratio $\lambda/\sigma$ in terms of $m$, $n$, $k_m$, and $k_n$. The first threshold, $\mathsf{SNR}_c$, corresponds to the computational boundary. Below this threshold, it is shown that no polynomial-time algorithm can succeed in identifying the submatrix, under the hidden clique hypothesis. We introduce adaptive, linear-time spectral algorithms that identify the submatrix with high probability when the signal strength is above the threshold $\mathsf{SNR}_c$. The second threshold, $\mathsf{SNR}_s$, captures the statistical boundary, below which no method can succeed with probability going to one in the minimax sense. The exhaustive search method successfully finds the submatrix above this threshold. The results show an interesting phenomenon: $\mathsf{SNR}_c$ is always significantly larger than $\mathsf{SNR}_s$, which implies an essential gap between statistical optimality and computational efficiency for submatrix localization.
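
To make the setting concrete, the following is a minimal sketch of a spectral approach to localizing a single planted submatrix. It is not the paper's algorithm: it assumes Gaussian noise, a single submatrix, and known sizes $k_m$ and $k_n$ (the algorithms described in the abstract are adaptive), and it simply keeps the largest-magnitude entries of the leading singular vectors. The function names `planted_matrix` and `spectral_localize` are hypothetical and introduced here only for illustration.

```python
import numpy as np


def planted_matrix(m, n, k_m, k_n, lam, sigma, rng):
    """Draw an m x n Gaussian noise matrix (std sigma) and add a constant
    signal lam on a random k_m x k_n submatrix. Returns the observation
    and the sorted true row and column index sets."""
    rows = rng.choice(m, size=k_m, replace=False)
    cols = rng.choice(n, size=k_n, replace=False)
    X = sigma * rng.standard_normal((m, n))
    X[np.ix_(rows, cols)] += lam
    return X, np.sort(rows), np.sort(cols)


def spectral_localize(X, k_m, k_n):
    """Estimate the planted rows and columns from the leading singular
    vectors of X. Assumes k_m and k_n are known (an illustrative
    simplification, not part of the source)."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    # Singular vectors are defined up to sign, so rank entries by magnitude.
    rows = np.sort(np.argsort(np.abs(u))[-k_m:])
    cols = np.sort(np.argsort(np.abs(v))[-k_n:])
    return rows, cols


# Example run in a strong-signal regime.
rng = np.random.default_rng(0)
X, true_rows, true_cols = planted_matrix(200, 200, 40, 40, 3.0, 1.0, rng)
est_rows, est_cols = spectral_localize(X, 40, 40)
```

In this example the signal contributes a singular value of about $\lambda\sqrt{k_m k_n} = 120$, well above the noise spectral norm of roughly $\sqrt{m} + \sqrt{n} \approx 28$, so the leading singular vectors concentrate on the planted rows and columns. As $\lambda/\sigma$ decreases, this simple thresholding step should be expected to fail before an exhaustive search would, which is the kind of gap the abstract describes.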

