NE AI LG | Aug 16, 2018

On the Decision Boundary of Deep Neural Networks

arXiv:1808.05385v3 | 20.043 citations | Has Code

Originality: Incremental advance

AI Analysis

This work provides insights into deep learning's inner workings, potentially aiding in addressing practical issues like catastrophic forgetting and adversarial attacks, though it is incremental in nature.

The paper investigates the decision boundary of deep neural networks, showing that the last weight layer converges to a linear SVM trained on the last hidden layer's output, both theoretically and empirically, and that full network training improves the bias constant for better generalization.

While deep learning models and techniques have achieved great empirical success, our understanding of the sources of that success remains very limited in many respects. In an attempt to bridge the gap, we investigate the decision boundary of a production deep learning architecture under weak assumptions on both the training data and the model. We demonstrate, both theoretically and empirically, that the last weight layer of a neural network converges to a linear SVM trained on the output of the last hidden layer, for both the binary case and the multi-class case with the commonly used cross-entropy loss. Furthermore, we show empirically that training a neural network as a whole, instead of only fine-tuning the last weight layer, may result in a better bias constant for the last weight layer, which is important for generalization. In addition to facilitating the understanding of deep learning, our result can be helpful for solving a broad range of practical problems in deep learning, such as catastrophic forgetting and adversarial attacks. The experiment code is available at https://github.com/lykaust15/NN_decision_boundary
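The central claim for the binary case can be illustrated with a minimal sketch. It is not the authors' experiment code. It assumes a hypothetical, hand-made set of four 2-D vectors standing in for last-hidden-layer outputs (`H`), trains only a bias-free last weight layer with the logistic (binary cross-entropy) loss by plain gradient descent, and compares the resulting direction with the hard-margin linear SVM solution for the same data. The SVM solution `w_svm` is derived by hand for this toy set rather than obtained from a solver, and the step count and learning rate are arbitrary choices.

```python
import numpy as np

# Hypothetical "last hidden layer" outputs for four samples (toy data, not from the paper).
H = np.array([[1.0, 1.0], [-1.0, -1.0], [3.0, 0.0], [0.0, -4.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])

# Hard-margin SVM through the origin for this toy set, derived by hand:
# minimize ||w|| subject to y_i * (w . h_i) >= 1.
# (0.5, 0.5) is the minimum-norm point with w . (1, 1) >= 1, and it also
# satisfies the two remaining constraints, so it is the SVM solution here.
w_svm = np.array([0.5, 0.5])


def train_last_layer(H, y, steps=20000, lr=0.5):
    """Gradient descent on the mean logistic loss for a bias-free linear layer."""
    w = np.zeros(H.shape[1])
    Z = H * y[:, None]  # label-signed features, so margin_i = Z[i] @ w
    for _ in range(steps):
        margins = Z @ w
        # d/dw log(1 + exp(-m_i)) = -Z[i] / (1 + exp(m_i))
        grad = -(Z / (1.0 + np.exp(margins))[:, None]).mean(axis=0)
        w -= lr * grad
    return w


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


w_nn = train_last_layer(H, y)
print("cosine(last layer, SVM):", cosine(w_nn, w_svm))
```

On this separable toy set the norm of the trained weights keeps growing, so only the direction is compared; the printed cosine similarity should be close to 1. The sketch leaves out the bias constant and the multi-class case, both of which the abstract also discusses.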
