NE | DIS-NN | LG | MATH-PH | Oct 28, 2019

Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes

arXiv:1910.12478v3 | 239 citations | Has Code
Originality: Highly original
AI Analysis

This work establishes a foundational theoretical link between neural networks and Gaussian processes for a wide class of architectures, potentially impacting researchers in machine learning theory and practitioners interested in Bayesian deep learning.

The paper shows that wide neural networks with random weights and biases are Gaussian processes across a broad range of modern architectures, including feedforward, recurrent, convolutional, and attention-based networks. This extends prior results for specific cases (deep fully-connected and deep convolutional networks) to every network expressible in the language for neural network computations that the paper introduces. It also provides open-source implementations of the Gaussian Process kernels for simple RNN, GRU, transformer, and batchnorm+ReLU networks.

Wide neural networks with random weights and biases are Gaussian processes, as originally observed by Neal (1995) and more recently by Lee et al. (2018) and Matthews et al. (2018) for deep fully-connected networks, as well as by Novak et al. (2019) and Garriga-Alonso et al. (2019) for deep convolutional networks. We show that this Neural Network-Gaussian Process correspondence surprisingly extends to all modern feedforward or recurrent neural networks composed of multilayer perceptron, RNNs (e.g. LSTMs, GRUs), (nD or graph) convolution, pooling, skip connection, attention, batch normalization, and/or layer normalization. More generally, we introduce a language for expressing neural network computations, and our result encompasses all such expressible neural networks. This work serves as a tutorial on the *tensor programs* technique formulated in Yang (2019) and elucidates the Gaussian Process results obtained there. We provide open-source implementations of the Gaussian Process kernels of simple RNN, GRU, transformer, and batchnorm+ReLU network at github.com/thegregyang/GP4A.
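For orientation, the correspondence in the simplest previously known setting, a deep fully-connected ReLU network, can be sketched as a layerwise kernel recursion. This is a minimal illustration under stated assumptions and is not the tensor programs technique or the GP4A code: the function name `relu_nngp_kernel` and the parameters `depth`, `sigma_w`, and `sigma_b` are hypothetical, weights are assumed drawn from N(0, sigma_w^2 / fan_in) and biases from N(0, sigma_b^2), and the ReLU expectation uses the standard closed form (the degree-1 arc-cosine kernel).

```python
import math


def relu_nngp_kernel(x1, x2, depth, sigma_w=math.sqrt(2.0), sigma_b=0.0):
    """Infinite-width GP kernel of a fully-connected ReLU network (sketch).

    Assumes weights ~ N(0, sigma_w^2 / fan_in) and biases ~ N(0, sigma_b^2).
    `depth` is the number of hidden ReLU layers. Returns the tuple
    (K(x1, x1), K(x1, x2), K(x2, x2)) for the output pre-activations.
    """
    n = len(x1)
    # Covariances of the first-layer pre-activations.
    k11 = sigma_b**2 + sigma_w**2 * sum(a * a for a in x1) / n
    k22 = sigma_b**2 + sigma_w**2 * sum(b * b for b in x2) / n
    k12 = sigma_b**2 + sigma_w**2 * sum(a * b for a, b in zip(x1, x2)) / n

    for _ in range(depth):
        norm = math.sqrt(k11 * k22)
        if norm == 0.0:
            e12 = 0.0  # a zero-variance pre-activation gives zero ReLU covariance
        else:
            # Clamp to guard against rounding just outside [-1, 1].
            cos_t = max(-1.0, min(1.0, k12 / norm))
            theta = math.acos(cos_t)
            # E[relu(u) relu(v)] for (u, v) ~ N(0, [[k11, k12], [k12, k22]]).
            e12 = norm * (math.sin(theta) + (math.pi - theta) * cos_t) / (2.0 * math.pi)
        # E[relu(u)^2] = Var(u) / 2 for a centered Gaussian u.
        e11, e22 = k11 / 2.0, k22 / 2.0
        k11 = sigma_b**2 + sigma_w**2 * e11
        k22 = sigma_b**2 + sigma_w**2 * e22
        k12 = sigma_b**2 + sigma_w**2 * e12
    return k11, k12, k22
```

With the default `sigma_w = sqrt(2)` and `sigma_b = 0`, the diagonal entries are preserved from layer to layer, so only the correlation between the two inputs changes with depth. For example, `relu_nngp_kernel([1.0, 0.0], [0.0, 1.0], depth=1)` starts from uncorrelated inputs and yields a positive cross-covariance after one ReLU layer.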

Code Implementations: 2 repos