MLLGJun 5, 2018

A Framework for the construction of upper bounds on the number of affine linear regions of ReLU feed-forward neural networks

arXiv:1806.01918v325 citations
Originality Incremental advance
AI Analysis

This work provides a theoretical tool for understanding the expressivity of neural networks, but it is incremental as it builds on and refines existing bounds in the literature.

The paper tackles the problem of deriving upper bounds on the number of affine linear regions in ReLU neural networks by introducing a framework based on inductive analysis and matrix products, which unifies and extends previous bounds, achieving new tighter results for asymptotic cases with infinite layers.

We present a framework to derive upper bounds on the number of regions that feed-forward neural networks with ReLU activation functions are affine linear on. It is based on an inductive analysis that keeps track of the number of such regions per dimensionality of their images within the layers. More precisely, the information about the number regions per dimensionality is pushed through the layers starting with one region of the input dimension of the neural network and using a recursion based on an analysis of how many regions per output dimensionality a subsequent layer with a certain width can induce on an input region with a given dimensionality. The final bound on the number of regions depends on the number and widths of the layers of the neural network and on some additional parameters that were used for the recursion. It is stated in terms of the $L1$-norm of the last column of a product of matrices and provides a unifying treatment of several previously known bounds: Depending on the choice of the recursion parameters that determine these matrices, it is possible to obtain the bounds from Montúfar (2014), (2017) and Serra et. al. (2017) as special cases. For the latter, which is the strongest of these bounds, the formulation in terms of matrices provides new insight. In particular, by using explicit formulas for a Jordan-like decomposition of the involved matrices, we achieve new tighter results for the asymptotic setting, where the number of layers of the same fixed width tends to infinity.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes