ML | LG | May 7, 2021

What Kinds of Functions do Deep Neural Networks Learn? Insights from Variational Spline Theory

arXiv:2105.03361v3 | 80 citations
Originality: Incremental advance
AI Analysis

This work offers researchers theoretical insight into deep learning, but it is incremental: it builds on existing variational and spline theory and does not demonstrate practical performance gains.

The authors study what kinds of functions deep ReLU neural networks learn. They develop a variational framework that defines a new function space capturing compositional structure, show that these networks solve regularized data-fitting problems over that space, and provide theoretical support for skip connections and rank-bounded weight matrices.

We develop a variational framework to understand the properties of functions learned by fitting deep neural networks with rectified linear unit activations to data. We propose a new function space, which is reminiscent of classical bounded variation-type spaces, that captures the compositional structure associated with deep neural networks. We derive a representer theorem showing that deep ReLU networks are solutions to regularized data fitting problems over functions from this space. The function space consists of compositions of functions from the Banach spaces of second-order bounded variation in the Radon domain. These are Banach spaces with sparsity-promoting norms, giving insight into the role of sparsity in deep neural networks. The neural network solutions have skip connections and rank bounded weight matrices, providing new theoretical support for these common architectural choices. The variational problem we study can be recast as a finite-dimensional neural network training problem with regularization schemes related to the notions of weight decay and path-norm regularization. Finally, our analysis builds on techniques from variational spline theory, providing new connections between deep neural networks and splines.
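
The abstract relates the variational problem to weight decay and path-norm regularization. As a minimal sketch of why those two notions are connected, the example below uses a single-hidden-layer ReLU network, which is a simplification and not the deep regularizer studied in the paper. Because ReLU is positively homogeneous, each neuron can be rescaled without changing the network function, and after balancing the scales the weight-decay penalty coincides with the path-norm. The function names (`shallow_net`, `weight_decay`, `path_norm`, `balance`) and all numeric values are illustrative assumptions.

```python
import math


def relu(t):
    return max(t, 0.0)


def shallow_net(x, W, b, v):
    """f(x) = sum_k v[k] * relu(<W[k], x> - b[k]) for one hidden layer."""
    return sum(
        vk * relu(sum(wi * xi for wi, xi in zip(wk, x)) - bk)
        for wk, bk, vk in zip(W, b, v)
    )


def weight_decay(W, v):
    """(1/2) * sum_k (v_k^2 + ||w_k||_2^2); biases are left unpenalized."""
    return 0.5 * sum(vk * vk + sum(wi * wi for wi in wk) for wk, vk in zip(W, v))


def path_norm(W, v):
    """sum_k |v_k| * ||w_k||_2."""
    return sum(
        abs(vk) * math.sqrt(sum(wi * wi for wi in wk)) for wk, vk in zip(W, v)
    )


def balance(W, b, v):
    """Rescale each neuron so that |v_k| == ||w_k||_2 without changing f.

    Uses relu(c * t) == c * relu(t) for c > 0. Neurons with a zero input
    weight vector or a zero output weight are copied unchanged.
    """
    W2, b2, v2 = [], [], []
    for wk, bk, vk in zip(W, b, v):
        norm = math.sqrt(sum(wi * wi for wi in wk))
        if norm == 0.0 or vk == 0.0:
            W2.append(list(wk))
            b2.append(bk)
            v2.append(vk)
            continue
        c = math.sqrt(abs(vk) / norm)
        W2.append([c * wi for wi in wk])
        b2.append(c * bk)
        v2.append(vk / c)
    return W2, b2, v2


# Hypothetical two-neuron network on 2-D inputs.
W = [[3.0, 4.0], [1.0, 0.0]]
b = [0.5, -1.0]
v = [2.0, -0.25]
x = [1.0, 1.0]

W_bal, b_bal, v_bal = balance(W, b, v)

print("f(x) before / after:", shallow_net(x, W, b, v), shallow_net(x, W_bal, b_bal, v_bal))
print("weight decay before:", weight_decay(W, v))
print("weight decay after: ", weight_decay(W_bal, v_bal))
print("path-norm:          ", path_norm(W, v))
```

By the arithmetic-geometric mean inequality, (1/2)(v_k^2 + ||w_k||^2) is at least |v_k| ||w_k||, with equality exactly when the two magnitudes match. Minimizing weight decay over all rescalings that leave the function unchanged therefore yields the path-norm in this shallow setting.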

