PR IT ST ML

Dec 22, 2016

Statistical limits of spiked tensor models

Amelia Perry, Alexander S. Wein, Afonso S. Bandeira

arXiv:1612.07728v2

18.382 citations

Originality: Incremental advance

AI Analysis

This work addresses fundamental statistical limits in spiked tensor models, a topic relevant to theoretical machine learning and signal processing. Its contribution is incremental: it refines existing bounds.

The paper studies the detection and estimation of rank-one deformations of symmetric random Gaussian tensors. It establishes upper and lower bounds on the critical signal-to-noise ratio for several priors on the planted vector. These bounds match up to a 1+o(1) factor as the tensor order increases, and the lower bounds close √2-factor gaps left by several previous works.

We study the statistical limits of both detecting and estimating a rank-one deformation of a symmetric random Gaussian tensor. We establish upper and lower bounds on the critical signal-to-noise ratio, under a variety of priors for the planted vector: (i) a uniformly sampled unit vector, (ii) i.i.d. $\pm 1$ entries, and (iii) a sparse vector where a constant fraction $\rho$ of entries are i.i.d. $\pm 1$ and the rest are zero. For each of these cases, our upper and lower bounds match up to a $1+o(1)$ factor as the order $d$ of the tensor becomes large. For sparse signals (iii), our bounds are also asymptotically tight in the sparse limit $\rho \to 0$ for any fixed $d$ (including the $d=2$ case of sparse PCA). Our upper bounds for (i) demonstrate a phenomenon reminiscent of the work of Baik, Ben Arous and Péché: an `eigenvalue' of a perturbed tensor emerges from the bulk at a strictly lower signal-to-noise ratio than when the perturbation itself exceeds the bulk; we quantify the size of this effect. We also provide some general results for larger classes of priors. In particular, the large $d$ asymptotics of the threshold location differs between problems with discrete priors versus continuous priors. Finally, for priors (i) and (ii) we carry out the replica prediction from statistical physics, which is conjectured to give the exact information-theoretic threshold for any fixed $d$. Of independent interest, we introduce a new improvement to the second moment method for contiguity, on which our lower bounds are based. Our technique conditions away from rare `bad' events that depend on interactions between the signal and noise. This enables us to close $\sqrt{2}$-factor gaps present in several previous works.
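To make the setup in the abstract concrete, the following is a minimal NumPy sketch of the three priors and of a rank-one deformation of a symmetric Gaussian tensor. It is an illustration under stated assumptions, not the paper's construction: the function names (`sample_prior`, `rank_one`, `symmetric_gaussian`, `spiked_tensor`), the Bernoulli($\rho$) choice of support for the sparse prior, and the scaling of both the signal and the noise are all hypothetical, since the abstract does not state the normalization.

```python
import itertools
import numpy as np


def sample_prior(n, kind, rho=0.1, rng=None):
    """Draw a planted vector from one of the three priors in the abstract."""
    rng = np.random.default_rng() if rng is None else rng
    if kind == "sphere":  # (i) uniformly sampled unit vector
        g = rng.standard_normal(n)
        return g / np.linalg.norm(g)
    if kind == "rademacher":  # (ii) i.i.d. +/-1 entries
        return rng.choice([-1.0, 1.0], size=n)
    if kind == "sparse":  # (iii) sparse +/-1 entries
        # Assumption: each entry is nonzero independently with probability rho.
        support = rng.random(n) < rho
        signs = rng.choice([-1.0, 1.0], size=n)
        return np.where(support, signs, 0.0)
    raise ValueError(f"unknown prior: {kind}")


def rank_one(x, d):
    """Return the order-d tensor power of x (d-fold outer product)."""
    t = x
    for _ in range(d - 1):
        t = np.multiply.outer(t, x)
    return t


def symmetric_gaussian(n, d, rng):
    """Symmetrize an i.i.d. Gaussian tensor by averaging over axis permutations.

    Assumption: this normalization is chosen for simplicity and need not
    match the one used in the paper.
    """
    g = rng.standard_normal((n,) * d)
    perms = list(itertools.permutations(range(d)))
    return sum(np.transpose(g, p) for p in perms) / len(perms)


def spiked_tensor(x, d, snr, rng=None):
    """Rank-one deformation of symmetric Gaussian noise (hypothetical scaling)."""
    rng = np.random.default_rng() if rng is None else rng
    return snr * rank_one(x, d) + symmetric_gaussian(len(x), d, rng)
```

For example, `spiked_tensor(sample_prior(6, "sphere"), d=3, snr=2.0)` returns a symmetric 6 × 6 × 6 array. Averaging over all $d!$ axis permutations is only practical for small $d$ and $n$, which is enough for experimenting with the model.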
