LG · CV · ML · Feb 15, 2020

Hold me tight! Influence of discriminative features on deep network boundaries

arXiv:2002.06349v4
53 citations
AI Analysis

This work provides insights into neural network explainability and robustness, particularly for computer vision applications, though its approach is incremental, building on existing adversarial robustness tools.

The study investigates how discriminative features shape the decision boundaries of deep neural networks. It finds that networks are highly invariant to non-discriminative features and that boundaries are sensitive to small perturbations of the training samples: changes in certain directions can cause sudden invariances in the orthogonal ones.

Important insights towards the explainability of neural networks reside in the characteristics of their decision boundaries. In this work, we borrow tools from the field of adversarial robustness, and propose a new perspective that relates dataset features to the distance of samples to the decision boundary. This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets. We use this framework to reveal some intriguing properties of CNNs. Specifically, we rigorously confirm that neural networks exhibit a high invariance to non-discriminative features, and show that the decision boundaries of a DNN can only exist as long as the classifier is trained with some features that hold them together. Finally, we show that the construction of the decision boundary is extremely sensitive to small perturbations of the training samples, and that changes in certain directions can lead to sudden invariances in the orthogonal ones. This is precisely the mechanism that adversarial training uses to achieve robustness.
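
To make the idea of "distance of samples to the decision boundary" along a feature direction concrete, here is a minimal sketch on a toy linear classifier. It is not the authors' method: the abstract only says that tools are borrowed from adversarial robustness, and this sketch swaps in a plain line search with bisection. The function `margin_along_direction`, the classifier `predict`, and all parameter values are hypothetical names chosen for illustration.

```python
import numpy as np

def margin_along_direction(predict, x, v, t_max=10.0, steps=100, tol=1e-6):
    """Approximate the smallest step t in (0, t_max] along direction v at which
    predict(x + t * v) differs from predict(x). Returns np.inf if the label
    never changes on the scanned grid."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)          # measure distance along a unit direction
    label = predict(x)

    # Coarse scan: find the first grid point where the predicted label flips.
    lo, hi = 0.0, None
    for t in np.linspace(0.0, t_max, steps + 1)[1:]:
        if predict(x + t * v) != label:
            hi = t
            break
        lo = t
    if hi is None:
        return np.inf

    # Bisection: shrink [lo, hi] around the boundary crossing.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predict(x + mid * v) != label:
            hi = mid
        else:
            lo = mid
    return hi

# Toy assumption: only the first coordinate is discriminative.
w = np.array([1.0, 0.0])

def predict(x):
    return int(w @ x > 0)

x = np.array([2.0, 5.0])
m_disc = margin_along_direction(predict, x, [-1.0, 0.0])  # close to 2.0
m_non = margin_along_direction(predict, x, [0.0, 1.0])    # inf: label never flips
print(m_disc, m_non)
```

In this toy setting the margin is finite along the discriminative coordinate and unbounded along the non-discriminative one, which mirrors the kind of invariance the abstract describes. For a deep network, `predict` would wrap the model's arg-max output, and the scan range would need to match the input scale.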

Code Implementations: 1 repo
