The SVD of Convolutional Weights: A CNN Interpretability Framework
This work proposes an interpretability framework for CNNs, addressing the need to understand model reasoning in image classification. The contribution appears incremental, since it extends SVD-based analysis from fully connected layers to convolutional layers.
The authors address the problem of interpreting convolutional neural networks by defining a singular value decomposition for convolutional weight tensors. The decomposition reveals correlations between filters, and the singular vectors with the largest singular values identify the features most important to the network. The authors validate the definition using results from random matrix theory and introduce the DeepDataProfiler library for analysis.
Deep neural networks used for image classification often use convolutional filters to extract distinguishing features before passing them to a linear classifier. Most interpretability literature focuses on providing semantic meaning to convolutional filters to explain a model's reasoning process and confirm its use of relevant information from the input domain. Fully connected layers can be studied by decomposing their weight matrices using a singular value decomposition, in effect studying the correlations between the rows in each matrix to discover the dynamics of the map. In this work we define a singular value decomposition for the weight tensor of a convolutional layer, which provides an analogous understanding of the correlations between filters, exposing the dynamics of the convolutional map. We validate our definition using recent results in random matrix theory. By applying the decomposition across the linear layers of an image classification network we suggest a framework against which interpretability methods might be applied using hypergraphs to model class separation. Rather than looking to the activations to explain the network, we use the singular vectors with the greatest corresponding singular values for each linear layer to identify those features most important to the network. We illustrate our approach with examples and introduce the DeepDataProfiler library, the analysis tool used for this study.
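To make the central idea concrete, here is a minimal sketch of one way to take an SVD of a convolutional weight tensor. It assumes the tensor has shape (out_channels, in_channels, kernel_height, kernel_width) and is unfolded so that each row is one flattened filter, mirroring the row-correlation view of fully connected layers described above. This unfolding is an assumption made for illustration and may differ from the paper's exact definition. The function names are hypothetical and are not part of the DeepDataProfiler API.

```python
import numpy as np


def conv_weight_svd(weight):
    """SVD of a conv weight tensor unfolded along its output-channel axis.

    Assumption: weight has shape (c_out, c_in, kh, kw). Each filter is
    flattened into one row, giving a (c_out, c_in * kh * kw) matrix.
    """
    c_out = weight.shape[0]
    unfolded = weight.reshape(c_out, -1)
    # Reduced SVD: unfolded == (u * s) @ vh, with s sorted in descending order.
    u, s, vh = np.linalg.svd(unfolded, full_matrices=False)
    return u, s, vh


def top_singular_directions(weight, k):
    """Return the k left singular vectors with the largest singular values."""
    u, s, _ = conv_weight_svd(weight)
    return u[:, :k], s[:k]


# Hypothetical example: a random layer with 8 filters, 3 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 3, 3, 3))

u, s, vh = conv_weight_svd(weight)
reconstructed = ((u * s) @ vh).reshape(weight.shape)
top_u, top_s = top_singular_directions(weight, k=2)
```

Under this unfolding, each column of `u` is a linear combination of the layer's filters (output channels), so the columns paired with the largest singular values give candidate directions for the features most important to the layer.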