LG · OC · ML · Oct 27, 2022

Learning Single-Index Models with Shallow Neural Networks

arXiv:2210.15651v1 · 110 citations · h-index: 48
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently learning from high-dimensional data with low-dimensional structure. It provides a neural-network-based approach that is competitive with tailored algorithms, though the advance is incremental because it builds on existing methods.

The paper tackles the problem of learning single-index models, functions whose output depends on the input only through a one-dimensional projection, using shallow neural networks with frozen biases. The authors show that the optimization landscape these networks present to gradient flow is benign, which yields generalization guarantees matching the near-optimal sample complexity of specialized methods.

Single-index models are a class of functions given by an unknown univariate "link" function applied to an unknown one-dimensional projection of the input. These models are particularly relevant in high dimension, when the data might present low-dimensional structure that learning algorithms should adapt to. While several statistical aspects of this model, such as the sample complexity of recovering the relevant (one-dimensional) subspace, are well-understood, they rely on tailored algorithms that exploit the specific structure of the target function. In this work, we introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow. More precisely, we consider shallow networks in which biases of the neurons are frozen at random initialization. We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.
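
To make the setup in the abstract concrete, here is a minimal NumPy sketch of a single-index target and a shallow network whose biases are frozen at random initialization. Several choices are assumptions made only for illustration and are not confirmed by the text above: the ReLU activation, a single direction `theta` shared by all neurons, the frozen random `signs`, the 1/sqrt(N) output scaling, the `tanh` link used in the usage note, and the use of small projected gradient steps as a stand-in for continuous-time gradient flow. All function names are hypothetical.

```python
import numpy as np

def single_index_target(X, theta_star, link):
    """Single-index model: a univariate link applied to a 1-D projection."""
    return link(X @ theta_star)

def init_network(d, n_neurons, rng):
    """Random unit direction, zero output weights, frozen biases and signs."""
    theta = rng.standard_normal(d)
    theta /= np.linalg.norm(theta)
    c = np.zeros(n_neurons)                          # trainable output weights
    biases = rng.standard_normal(n_neurons)          # frozen (never updated)
    signs = rng.choice([-1.0, 1.0], size=n_neurons)  # frozen (assumption)
    return theta, c, biases, signs

def predict(X, theta, c, biases, signs):
    """Network output: sum_i c_i * relu(s_i * <theta, x> - b_i) / sqrt(N)."""
    pre = signs * (X @ theta)[:, None] - biases      # shape (n, N)
    return np.maximum(pre, 0.0) @ c / np.sqrt(len(c))

def squared_loss(X, y, theta, c, biases, signs):
    """Empirical loss: half the mean squared residual."""
    return 0.5 * np.mean((predict(X, theta, c, biases, signs) - y) ** 2)

def gradient_step(X, y, theta, c, biases, signs, lr):
    """One Euler step of gradient descent on (theta, c); biases stay fixed."""
    n, N = len(y), len(c)
    pre = signs * (X @ theta)[:, None] - biases
    act = np.maximum(pre, 0.0)
    resid = act @ c / np.sqrt(N) - y
    grad_c = act.T @ resid / (n * np.sqrt(N))
    # Derivative of the prediction with respect to the projection <theta, x>.
    dpred_dz = ((pre > 0) * (c * signs)).sum(axis=1) / np.sqrt(N)
    grad_theta = X.T @ (resid * dpred_dz) / n
    # Keep theta on the unit sphere: drop the radial component, renormalize.
    grad_theta -= (grad_theta @ theta) * theta
    theta_new = theta - lr * grad_theta
    theta_new /= np.linalg.norm(theta_new)
    return theta_new, c - lr * grad_c
```

A typical use would draw Gaussian inputs `X`, pick a hidden unit vector `theta_star`, set `y = single_index_target(X, theta_star, np.tanh)`, and call `gradient_step` repeatedly with a small `lr`, tracking `abs(theta @ theta_star)` to see whether the learned direction aligns with the hidden one. The sketch illustrates only the model class; it does not reproduce the paper's analysis or guarantees.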

