LG · AI · Jan 24, 2022

Theoretical Exploration of Solutions of Feedforward ReLU Networks

arXiv:2202.01919v9 · 4 citations
Originality: Incremental advance
AI Analysis

This work addresses the theoretical understanding of neural network mechanisms for researchers in machine learning, though it appears incremental as it builds on existing affine-geometry frameworks.

The paper tackles the problem of interpreting feedforward ReLU networks by exploring their solutions for piecewise linear functions, providing universal solutions for three-layer and deep-layer architectures and offering clear interpretations of network components and mechanisms like parameter-sharing and overparameterization.

This paper aims to interpret the mechanism of feedforward ReLU networks by exploring their solutions for piecewise linear functions, deduced from basic rules. The constructed solution should be universal enough to explain network architectures used in engineering; to that end, several ways of enhancing the universality of the solution are provided. The consequences of our theories include the following: under an affine-geometry background, solutions of both three-layer and deep-layer networks are given, particularly for architectures applied in practice, such as multilayer feedforward neural networks and decoders; clear and intuitive interpretations are given for each component of the network architectures; the parameter-sharing mechanism for multiple outputs is investigated; overparameterization solutions are explained in terms of affine transforms; and, under our framework, an advantage of deep layers over shallower ones follows naturally. Some intermediate results provide basic knowledge for modeling or understanding neural networks, such as the classification of data embedded in a higher-dimensional space, the generalization of affine transforms, the probabilistic model of matrix ranks, and the concept of distinguishable data sets.
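To make the idea of a ReLU network "solution" for a piecewise linear function concrete, the following is a minimal sketch of the standard one-dimensional construction, in which a single hidden layer contributes one ReLU unit per interior breakpoint. It is not the affine-geometry construction developed in the paper, only a familiar special case; the function names `build_network` and `forward`, and the example knots and values, are assumptions introduced for illustration.

```python
def relu(z):
    """Rectified linear unit for a scalar input."""
    return z if z > 0.0 else 0.0


def build_network(knots, values):
    """Build a one-hidden-layer ReLU network that interpolates the points.

    The network has the form f(x) = c + sum_j a[j] * relu(w[j] * x + b[j])
    and equals the continuous piecewise linear function through
    (knots[i], values[i]); the first and last slopes are extended outward.
    Names and layout here are illustrative assumptions, not the paper's.
    """
    if len(knots) != len(values) or len(knots) < 2:
        raise ValueError("need at least two (knot, value) pairs")
    if any(k1 <= k0 for k0, k1 in zip(knots, knots[1:])):
        raise ValueError("knots must be strictly increasing")

    # Slope of each linear piece between consecutive knots.
    slopes = [
        (values[i + 1] - values[i]) / (knots[i + 1] - knots[i])
        for i in range(len(knots) - 1)
    ]
    x0, y0 = knots[0], values[0]

    # Two units reproduce the line s0 * (x - x0) on both sides of x0,
    # since z = relu(z) - relu(-z).
    w = [1.0, -1.0]
    b = [-x0, x0]
    a = [slopes[0], -slopes[0]]

    # One unit per interior knot; its output weight is the change in slope.
    for i in range(1, len(slopes)):
        w.append(1.0)
        b.append(-knots[i])
        a.append(slopes[i] - slopes[i - 1])

    return w, b, a, y0


def forward(params, x):
    """Evaluate the network at a scalar input x."""
    w, b, a, c = params
    return c + sum(aj * relu(wj * x + bj) for wj, bj, aj in zip(w, b, a))


# Hypothetical target: a piecewise linear function with breakpoints at 1 and 2.
params = build_network([0.0, 1.0, 2.0, 4.0], [0.0, 2.0, 1.0, 1.0])
samples = [forward(params, x) for x in (0.0, 0.5, 1.0, 2.0, 3.0, 4.0)]
```

Each hidden unit switches on at one breakpoint and adds the change in slope there, which is one simple way to read the role of individual units. The higher-dimensional and deep-layer cases treated in the paper need more than this one-dimensional picture.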

