SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions
This work removes a technical limitation in the known hardness results for NGCA and, as a consequence, provides improved lower bounds for concrete estimation tasks. It is aimed at researchers in statistical learning and complexity theory.
The paper proves Statistical Query (SQ) lower bounds for Non-Gaussian Component Analysis (NGCA) without the previously required chi-squared norm condition. It establishes near-optimal SQ lower bounds under a moment-matching condition alone and generalizes them to hidden subspaces.
We study the complexity of Non-Gaussian Component Analysis (NGCA) in the Statistical Query (SQ) model. Prior work developed a general methodology for proving SQ lower bounds for this task, which has been applied in a wide range of contexts. In particular, it was known that, for any univariate distribution $A$ satisfying certain conditions, distinguishing between a standard multivariate Gaussian and a distribution that behaves like $A$ in a random hidden direction and like a standard Gaussian in the orthogonal complement is SQ-hard. The required conditions were that (1) $A$ matches many low-order moments with the standard univariate Gaussian, and (2) the chi-squared norm of $A$ with respect to the standard Gaussian is finite. While the moment-matching condition is necessary for hardness, the chi-squared condition was only required for technical reasons. In this work, we establish that the latter condition is indeed not necessary. In particular, we prove near-optimal SQ lower bounds for NGCA under the moment-matching condition only. Our result naturally generalizes to the setting of a hidden subspace. Leveraging our general SQ lower bound, we obtain near-optimal SQ lower bounds for a range of concrete estimation tasks where existing techniques provide sub-optimal or even vacuous guarantees.
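To make the hidden-direction construction concrete, the following is a minimal Python sketch of how one could sample from such a distribution. The particular choice of $A$ here is an assumption made purely for illustration and is not taken from the paper: it is the three-point Gauss-Hermite rule on $\{-\sqrt{3}, 0, \sqrt{3}\}$, which matches the first five moments of the standard Gaussian but, being discrete, has infinite chi-squared norm with respect to it. All function and constant names are hypothetical.

```python
import math
import random

# Hypothetical choice of A (an illustration, not from the paper):
# the 3-point Gauss-Hermite rule. It is discrete, so its chi-squared
# norm with respect to N(0, 1) is infinite.
A_SUPPORT = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
A_PROBS = (1 / 6, 2 / 3, 1 / 6)


def moment_of_A(k):
    """k-th moment of the discrete distribution A."""
    return sum(p * x**k for x, p in zip(A_SUPPORT, A_PROBS))


def gaussian_moment(k):
    """k-th moment of N(0, 1): 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0.0
    result = 1.0
    for j in range(k - 1, 0, -2):
        result *= j
    return result


def random_unit_vector(d, rng):
    """Uniformly random direction in R^d (normalized Gaussian vector)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]


def sample_hidden_direction(d, v, rng):
    """One sample in R^d: distributed as A along v, N(0, I) orthogonal to v."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    g_dot_v = sum(gi * vi for gi, vi in zip(g, v))
    a = rng.choices(A_SUPPORT, weights=A_PROBS, k=1)[0]
    # Swap the component of g along v for a draw from A;
    # the part of g orthogonal to v is left untouched.
    return [gi + (a - g_dot_v) * vi for gi, vi in zip(g, v)]


if __name__ == "__main__":
    rng = random.Random(0)
    v = random_unit_vector(10, rng)
    x = sample_hidden_direction(10, v, rng)
    print("projection onto hidden direction:", sum(xi * vi for xi, vi in zip(x, v)))
    for k in range(1, 7):
        print(k, moment_of_A(k), gaussian_moment(k))
```

In this example, $A$ satisfies the moment-matching condition (up to degree five) while violating the finite chi-squared condition, which is the kind of distribution that falls outside the earlier methodology as described above.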