NE | AI | LG | Aug 16, 2018

On the Decision Boundary of Deep Neural Networks

arXiv:1808.05385v3 | 43 citations | Has Code
Originality: Incremental advance
AI Analysis

This work provides insights into deep learning's inner workings, potentially aiding in addressing practical issues like catastrophic forgetting and adversarial attacks, though it is incremental in nature.

The paper investigates the decision boundary of deep neural networks, showing that the last weight layer converges to a linear SVM trained on the last hidden layer's output, both theoretically and empirically, and that full network training improves the bias constant for better generalization.

While deep learning models and techniques have achieved great empirical success, our understanding of the source of success in many aspects remains very limited. In an attempt to bridge the gap, we investigate the decision boundary of a production deep learning architecture with weak assumptions on both the training data and the model. We demonstrate, both theoretically and empirically, that the last weight layer of a neural network converges to a linear SVM trained on the output of the last hidden layer, for both the binary case and the multi-class case with the commonly used cross-entropy loss. Furthermore, we show empirically that training a neural network as a whole, instead of only fine-tuning the last weight layer, may result in a better bias constant for the last weight layer, which is important for generalization. In addition to facilitating the understanding of deep learning, our result can be helpful for solving a broad range of practical problems of deep learning, such as catastrophic forgetting and adversarial attacks. The experiment code is available at https://github.com/lykaust15/NN_decision_boundary
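
The convergence claim in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' code: it assumes the binary case, drops the bias term for simplicity, and uses four made-up 2-D points as stand-ins for last-hidden-layer outputs. The names `train_last_layer`, `cosine`, and `w_svm` are hypothetical. A single linear layer is trained with the logistic (binary cross-entropy) loss by plain gradient descent, and its direction is compared with the bias-free hard-margin SVM solution, which for this data can be worked out by hand.

```python
import numpy as np

# Made-up, linearly separable "last hidden layer" features and labels (assumption).
X = np.array([[2.0, 0.0], [0.0, -1.0], [6.0, 1.0], [-5.0, -0.5]])
y = np.array([1.0, -1.0, 1.0, -1.0])
Z = X * y[:, None]  # z_i = y_i * x_i, so the margin of sample i is z_i . w

# Bias-free hard-margin SVM: minimise ||w||^2 subject to z_i . w >= 1.
# The first two samples give 2*w1 >= 1 and w2 >= 1, so w = (0.5, 1.0);
# the remaining two samples already satisfy their constraints at that point.
w_svm = np.array([0.5, 1.0])


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def train_last_layer(Z, lr=0.05, steps=50_000, checkpoints=(1, 100, 10_000, 50_000)):
    """Gradient descent on sum_i log(1 + exp(-z_i . w)), starting from w = 0."""
    w = np.zeros(Z.shape[1])
    history = {}
    for t in range(1, steps + 1):
        margins = Z @ w
        # sigmoid(-margin), written with tanh for numerical stability
        weights = 0.5 * (1.0 - np.tanh(0.5 * margins))
        grad = -(Z * weights[:, None]).sum(axis=0)
        w -= lr * grad
        if t in checkpoints:
            history[t] = cosine(w, w_svm)
    return w, history


w_final, cos_history = train_last_layer(Z)
for step, cos in cos_history.items():
    print(f"step {step:>6}: cosine(w, w_svm) = {cos:.4f}")
```

In this sketch the cosine similarity to the SVM direction starts low, because the first gradient step is dominated by the points far from the boundary, and moves towards 1 as training continues. The approach is slow (the weight norm grows only logarithmically in the number of steps), so the similarity gets close to 1 without reaching it in a finite run. The bias constant discussed in the abstract is deliberately left out here.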

Code Implementations: 1 repo
