LG DS MLApr 8, 2022

Learning Polynomial Transformations

Sitan Chen, Jerry Li, Yuanzhi Li, Anru R. Zhang

arXiv:2204.04209v15.84 citationsh-index: 28

Originality Incremental advance

AI Analysis

This addresses the learnability of deep generative models, specifically two-layer neural networks with polynomial activations, which is crucial for understanding their practical performance, though it is incremental as it focuses on a special case.

The paper tackles the problem of learning high-dimensional polynomial transformations of Gaussians, providing polynomial-time algorithms for quadratic and constant-degree polynomial transformations in a smoothed setting, with extensions to rotation-invariant input distributions.

We consider the problem of learning high dimensional polynomial transformations of Gaussians. Given samples of the form $p(x)$, where $x\sim N(0, \mathrm{Id}_r)$ is hidden and $p: \mathbb{R}^r \to \mathbb{R}^d$ is a function where every output coordinate is a low-degree polynomial, the goal is to learn the distribution over $p(x)$. This problem is natural in its own right, but is also an important special case of learning deep generative models, namely pushforwards of Gaussians under two-layer neural networks with polynomial activations. Understanding the learnability of such generative models is crucial to understanding why they perform so well in practice. Our first main result is a polynomial-time algorithm for learning quadratic transformations of Gaussians in a smoothed setting. Our second main result is a polynomial-time algorithm for learning constant-degree polynomial transformations of Gaussian in a smoothed setting, when the rank of the associated tensors is small. In fact our results extend to any rotation-invariant input distribution, not just Gaussian. These are the first end-to-end guarantees for learning a pushforward under a neural network with more than one layer. Along the way, we also give the first polynomial-time algorithms with provable guarantees for tensor ring decomposition, a popular generalization of tensor decomposition that is used in practice to implicitly store large tensors.

View on arXiv PDF

Similar