LG · May 2, 2016

Some Insights into the Geometry and Training of Neural Networks

arXiv:1605.00329v1 · 7 citations
Originality: Synthesis-oriented
AI Analysis

This work offers incremental insights into neural network training geometry, potentially aiding researchers in understanding and optimizing training processes.

The paper analyzes neural networks from a feature-space perspective to provide insights into training and classification. It connects the scaling of the weights to the density of the training samples and to the formation of gradients, which helps explain the vanishing gradient problem and the need for regularization, and it suggests subsampling the training data to improve performance.

Neural networks have been successfully used for classification tasks in a rapidly growing number of practical applications. Despite their popularity and widespread use, there are still many aspects of training and classification that are not well understood. In this paper we aim to provide some new insights into training and classification by analyzing neural networks from a feature-space perspective. We review and explain the formation of decision regions and study some of their combinatorial aspects. We place a particular emphasis on the connections between the neural network weight and bias terms and properties of decision boundaries and other regions that exhibit varying levels of classification confidence. We show how the error backpropagates in these regions and emphasize the important role they have in the formation of gradients. These findings expose the connections between scaling of the weight parameters and the density of the training samples. This sheds more light on the vanishing gradient problem, explains the need for regularization, and suggests an approach for subsampling training data to improve performance.
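
As a rough illustration of the link the abstract draws between weight scaling and training sample density, the sketch below uses a single sigmoid unit on evenly spaced one-dimensional samples. It is not the paper's experiment: the function name `active_fraction`, the threshold `tol`, the sample grid, and the scale factors are all assumptions chosen for illustration. The idea is that scaling the weight and bias narrows the region around the decision boundary where the sigmoid derivative is non-negligible, so fewer samples contribute to the gradient.

```python
import numpy as np

def active_fraction(x, w, b, scale, tol=1e-3):
    """Fraction of samples whose sigmoid derivative exceeds tol
    when the weight and bias (w, b) are both multiplied by scale."""
    z = scale * (w * x + b)
    # Numerically stable sigmoid derivative: s(z) * (1 - s(z)) = e / (1 + e)^2
    # with e = exp(-|z|), which never overflows.
    e = np.exp(-np.abs(z))
    grad = e / (1.0 + e) ** 2
    return float(np.mean(grad > tol))

# Hypothetical setup: 2001 evenly spaced samples on [-1, 1],
# with the decision boundary at x = 0.
x = np.linspace(-1.0, 1.0, 2001)
scales = (1.0, 10.0, 100.0, 1000.0)
fractions = [active_fraction(x, w=1.0, b=0.0, scale=c) for c in scales]

for c, f in zip(scales, fractions):
    print(f"scale={c:>7.1f}  fraction of samples with non-negligible gradient: {f:.4f}")
```

In this toy setting the fraction shrinks as the scale grows, so with sparse samples and large weights few or no points may fall inside the transition region. That is one simplified way to read the abstract's remarks on vanishing gradients and regularization.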

