CV · Sep 14, 2016

Understanding Convolutional Neural Networks with A Mathematical Model

arXiv:1609.04112v2 · 401 citations
AI Analysis

This work provides theoretical insights into CNN design; the contribution is incremental for researchers and practitioners in deep learning.

The paper addresses why non-linear activation functions are essential in convolutional neural networks and what advantage a two-layer cascade has over a one-layer system. It proposes a mathematical model called RECOS to explain both points, illustrates the discussion with LeNet-5 on the MNIST dataset, and generalizes the model to multi-layer systems using AlexNet as an example.

This work attempts to address two fundamental questions about the structure of the convolutional neural networks (CNN): 1) why a non-linear activation function is essential at the filter output of every convolutional layer? 2) what is the advantage of the two-layer cascade system over the one-layer system? A mathematical model called the "REctified-COrrelations on a Sphere" (RECOS) is proposed to answer these two questions. After the CNN training process, the converged filter weights define a set of anchor vectors in the RECOS model. Anchor vectors represent the frequently occurring patterns (or the spectral components). The necessity of rectification is explained using the RECOS model. Then, the behavior of a two-layer RECOS system is analyzed and compared with its one-layer counterpart. The LeNet-5 and the MNIST dataset are used to illustrate discussion points. Finally, the RECOS model is generalized to a multi-layer system with the AlexNet as an example.

Keywords: Convolutional Neural Network (CNN), Nonlinear Activation, RECOS Model, Rectified Linear Unit (ReLU), MNIST Dataset.
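The idea named in the abstract, rectified correlations on a sphere, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `normalize`, `recos_unit`, and `second_layer`, the two toy anchor vectors, and the second-layer weights are all hypothetical. The sketch projects an input onto the unit sphere, correlates it with unit-length anchor vectors, and clips negative correlations to zero. It then shows one plausible reading of why rectification matters: without it, a negated pattern combined with a negative second-layer weight yields a positive response.

```python
import math


def normalize(v):
    """Scale v to unit length so that it lies on the unit sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [c / norm for c in v]


def recos_unit(x, anchors, rectify=True):
    """Correlate the normalized input with each normalized anchor vector.

    With rectify=True, negative correlations are clipped to zero (ReLU).
    """
    x_hat = normalize(x)
    responses = []
    for anchor in anchors:
        corr = sum(xi * ai for xi, ai in zip(x_hat, normalize(anchor)))
        responses.append(max(0.0, corr) if rectify else corr)
    return responses


# Hypothetical toy setup: two anchor vectors and a second layer with
# a negative weight on the first anchor's response.
anchors = [[1.0, 0.0], [0.0, 1.0]]
second_layer_weights = [-1.0, 0.0]


def second_layer(responses):
    """Weighted sum of first-layer responses (one second-layer filter)."""
    return sum(w * r for w, r in zip(second_layer_weights, responses))


pattern = [3.0, 0.0]    # aligned with the first anchor
negated = [-3.0, 0.0]   # the opposite pattern

# Without rectification, the negated pattern flips sign twice and
# produces a positive second-layer response.
unrectified = second_layer(recos_unit(negated, anchors, rectify=False))

# With rectification, the negated pattern is suppressed at layer one.
rectified = second_layer(recos_unit(negated, anchors, rectify=True))
```

In this toy setup the unrectified cascade responds to the negated pattern as strongly as a positively weighted filter would respond to the original pattern, while the rectified cascade returns zero for it.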

Code Implementations · 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
