Linear Regression with an Unknown Permutation: Statistical and Computational Limits
This addresses a fundamental challenge in statistical inference with permutations, providing theoretical limits and computational insights, though it is incremental in extending known permutation recovery frameworks.
The paper tackles the problem of recovering an unknown permutation in a noisy linear observation model, establishing sharp conditions on SNR, sample size, and dimension for exact and approximate recovery, and showing that maximum likelihood estimation is NP-hard but polynomial-time solvable for dimension one.
Consider a noisy linear observation model with an unknown permutation, based on observing $y = Π^* A x^* + w$, where $x^* \in \mathbb{R}^d$ is an unknown vector, $Π^*$ is an unknown $n \times n$ permutation matrix, and $w \in \mathbb{R}^n$ is additive Gaussian noise. We analyze the problem of permutation recovery in a random design setting in which the entries of the matrix $A$ are drawn i.i.d. from a standard Gaussian distribution, and establish sharp conditions on the SNR, sample size $n$, and dimension $d$ under which $Π^*$ is exactly and approximately recoverable. On the computational front, we show that the maximum likelihood estimate of $Π^*$ is NP-hard to compute, while also providing a polynomial time algorithm when $d =1$.