ML LG Jun 14, 2022

Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks

arXiv:2206.06526v26.73 citations h-index: 80

Originality: Incremental advance

AI Analysis

This work addresses fundamental theoretical questions in high-dimensional statistics and machine learning, with implications for methods like projection pursuit and neural networks, though it appears incremental as it builds on existing asymptotic analyses.

The paper characterizes the set of probability distributions that arise as low-dimensional projections of high-dimensional Gaussian data in the limit where sample size and dimension grow proportionally. It establishes new inner and outer bounds on this set, determines its Wasserstein radius exactly for one-dimensional projections and up to constant multiplicative factors in general, and proves sharp bounds in terms of Kullback-Leibler divergence and Rényi information dimension. It applies these findings to unsupervised learning methods such as projection pursuit and independent component analysis, and extends them to supervised learning, deriving an upper bound on the interpolation threshold for two-layer neural networks with a fixed number of hidden neurons.

Given a cloud of $n$ data points in $\mathbb{R}^d$, consider all projections onto $m$-dimensional subspaces of $\mathbb{R}^d$ and, for each such projection, the empirical distribution of the projected points. What does this collection of probability distributions look like when $n,d$ grow large? We consider this question under the null model in which the points are i.i.d. standard Gaussian vectors, focusing on the asymptotic regime in which $n,d\to\infty$, with $n/d\to\alpha\in (0,\infty)$, while $m$ is fixed. Denoting by $\mathscr{F}_{m,\alpha}$ the set of probability distributions in $\mathbb{R}^m$ that arise as low-dimensional projections in this limit, we establish new inner and outer bounds on $\mathscr{F}_{m,\alpha}$. In particular, we characterize the Wasserstein radius of $\mathscr{F}_{m,\alpha}$ up to constant multiplicative factors, and determine it exactly for $m=1$. We also prove sharp bounds in terms of Kullback-Leibler divergence and Rényi information dimension. The previous question has applications to unsupervised learning methods, such as projection pursuit and independent component analysis. We introduce a version of the same problem that is relevant for supervised learning, and prove a sharp Wasserstein radius bound. As an application, we establish an upper bound on the interpolation threshold of two-layer neural networks with $m$ hidden neurons.
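The setup in the abstract can be illustrated numerically. The following is a minimal sketch, assuming NumPy and finite $n$, $d$ in place of the asymptotic limit; the function names are illustrative and not taken from the paper. It projects i.i.d. standard Gaussian points onto one direction ($m=1$) and compares a direction chosen independently of the data with a data-dependent one. The top right singular vector is used here only as a convenient data-dependent choice, not as the paper's construction, and the Wasserstein distance is approximated by matching sorted samples to Gaussian quantiles.

```python
import numpy as np
from statistics import NormalDist


def random_orthonormal_frame(d, m, rng):
    """Return a d x m matrix with orthonormal columns (QR of a Gaussian matrix)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, m)))
    return q


def w2_to_standard_normal_1d(z):
    """Approximate the 2-Wasserstein distance between the empirical
    distribution of the 1-D sample z and N(0, 1), by coupling sorted
    samples with standard normal quantiles at midpoint probabilities."""
    n = len(z)
    z_sorted = np.sort(z)
    probs = (np.arange(n) + 0.5) / n
    std_normal = NormalDist()
    quantiles = np.array([std_normal.inv_cdf(p) for p in probs])
    return float(np.sqrt(np.mean((z_sorted - quantiles) ** 2)))


rng = np.random.default_rng(0)
n, d, m = 400, 200, 1              # finite-size stand-in for n/d -> alpha = 2
X = rng.standard_normal((n, d))    # null model: i.i.d. standard Gaussian rows

# Direction drawn independently of the data: projection is close to N(0, 1).
W_random = random_orthonormal_frame(d, m, rng)
z_random = (X @ W_random)[:, 0]

# Data-dependent direction (assumption: top right singular vector of X),
# which maximizes the empirical second moment of the projection.
_, _, vt = np.linalg.svd(X, full_matrices=False)
z_top = X @ vt[0]

dist_random = w2_to_standard_normal_1d(z_random)
dist_top = w2_to_standard_normal_1d(z_top)
second_moment_random = float(np.mean(z_random ** 2))
second_moment_top = float(np.mean(z_top ** 2))

print(f"random direction: W2 ~ {dist_random:.3f}, second moment ~ {second_moment_random:.3f}")
print(f"top direction:    W2 ~ {dist_top:.3f}, second moment ~ {second_moment_top:.3f}")
```

A direction fixed in advance yields a projection that is nearly standard Gaussian, while searching over directions can produce empirical distributions visibly far from it. The set $\mathscr{F}_{m,\alpha}$ describes how far such a search can go in the limit.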
