A Singular Value Perspective on Model Robustness
This work provides insight into the generalization behavior and adversarial robustness of CNNs, a topic of direct relevance to researchers and practitioners aiming to build more reliable AI systems.
This paper explores the relationship between CNN generalization and the Singular Value Decomposition (SVD) of images, showing that naturally trained and adversarially robust CNNs exploit different features. These features can be disentangled by SVD for ImageNet and CIFAR-10 trained networks.
Convolutional Neural Networks (CNNs) have made significant progress on several computer vision benchmarks, but are fraught with numerous non-human biases such as vulnerability to adversarial samples. Their lack of explainability makes identification and rectification of these biases difficult, and understanding their generalization behavior remains an open problem. In this work we explore the relationship between the generalization behavior of CNNs and the Singular Value Decomposition (SVD) of images. We show that naturally trained and adversarially robust CNNs exploit highly different features for the same dataset. We demonstrate that these features can be disentangled by SVD for ImageNet and CIFAR-10 trained networks. Finally, we propose Rank Integrated Gradients (RIG), the first rank-based feature attribution method to understand the dependence of CNNs on image rank.
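To make the notion of image rank concrete, the following is a minimal sketch of a rank-k approximation of an image via truncated SVD. It is not the authors' pipeline: the function names `low_rank_approx` and `low_rank_rgb`, the per-channel treatment of color images, and the random test image are assumptions made purely for illustration.

```python
import numpy as np

def low_rank_approx(channel, k):
    """Best rank-k approximation (in the Frobenius norm) of a 2-D array."""
    u, s, vt = np.linalg.svd(channel, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]

def low_rank_rgb(image, k):
    """Apply the rank-k approximation to each color channel independently.

    Per-channel SVD is an assumption of this sketch, not something the
    abstract specifies.
    """
    channels = [low_rank_approx(image[..., c], k) for c in range(image.shape[-1])]
    return np.stack(channels, axis=-1)

# Hypothetical 32x32 RGB image (CIFAR-10-sized) filled with random values.
rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))

approx = low_rank_rgb(img, 8)
print(approx.shape)
print(np.linalg.matrix_rank(approx[..., 0]))
```

Feeding such truncated images to a classifier at increasing values of k is one way to probe how much a network depends on low-rank versus high-rank image content.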