LGDSMLApr 8, 2022

Learning Polynomial Transformations

arXiv:2204.04209v14 citationsh-index: 28
Originality Incremental advance
AI Analysis

This addresses the learnability of deep generative models, specifically two-layer neural networks with polynomial activations, which is crucial for understanding their practical performance, though it is incremental as it focuses on a special case.

The paper tackles the problem of learning high-dimensional polynomial transformations of Gaussians, providing polynomial-time algorithms for quadratic and constant-degree polynomial transformations in a smoothed setting, with extensions to rotation-invariant input distributions.

We consider the problem of learning high dimensional polynomial transformations of Gaussians. Given samples of the form $p(x)$, where $x\sim N(0, \mathrm{Id}_r)$ is hidden and $p: \mathbb{R}^r \to \mathbb{R}^d$ is a function where every output coordinate is a low-degree polynomial, the goal is to learn the distribution over $p(x)$. This problem is natural in its own right, but is also an important special case of learning deep generative models, namely pushforwards of Gaussians under two-layer neural networks with polynomial activations. Understanding the learnability of such generative models is crucial to understanding why they perform so well in practice. Our first main result is a polynomial-time algorithm for learning quadratic transformations of Gaussians in a smoothed setting. Our second main result is a polynomial-time algorithm for learning constant-degree polynomial transformations of Gaussian in a smoothed setting, when the rank of the associated tensors is small. In fact our results extend to any rotation-invariant input distribution, not just Gaussian. These are the first end-to-end guarantees for learning a pushforward under a neural network with more than one layer. Along the way, we also give the first polynomial-time algorithms with provable guarantees for tensor ring decomposition, a popular generalization of tensor decomposition that is used in practice to implicitly store large tensors.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes