Explaining Deep Learning Models using Causal Inference
This work addresses the need for trust and interpretability in deep learning models deployed in commercial applications, though its contribution appears incremental: it applies causal inference to an existing challenge.
The paper proposes a causal inference framework for reasoning over CNN architectures and uses it to quantitatively rank the filters of a convolution layer by their counterfactual importance. The approach is demonstrated on LeNet5, VGG19, and ResNet32.
Although deep learning models have been successfully applied to a variety of tasks, their millions of parameters make them increasingly opaque and complex. To establish trust in their widespread commercial use, it is important to formalize a principled framework for reasoning over these models. In this work, we use ideas from causal inference to describe a general framework for reasoning over CNN models. Specifically, we build a Structural Causal Model (SCM) as an abstraction over a specific aspect of the CNN. We also formulate a method to quantitatively rank the filters of a convolution layer according to their counterfactual importance. We illustrate our approach with popular CNN architectures such as LeNet5, VGG19, and ResNet32.
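To make the idea of ranking filters by counterfactual importance concrete, the following is a minimal sketch and not the paper's actual SCM construction. It assumes a toy model (one convolution layer, ReLU, global average pooling, linear classifier) and treats the intervention as forcing one filter's pooled response to zero, then measuring how much the predicted-class logit changes. All names (`conv2d_valid`, `forward`, `rank_filters`) and the toy data are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernels):
    """Naive 'valid' cross-correlation.
    image: (H, W); kernels: (K, kh, kw) -> output: (K, H-kh+1, W-kw+1)."""
    K, kh, kw = kernels.shape
    H, W = image.shape
    out = np.zeros((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(image[i:i + kh, j:j + kw] * kernels[k])
    return out

def forward(image, kernels, weights, mask=None):
    """Toy model (assumption): conv -> ReLU -> global average pool -> linear logits."""
    responses = np.maximum(conv2d_valid(image, kernels), 0.0).mean(axis=(1, 2))  # (K,)
    if mask is not None:
        # Intervention: force the masked filters' pooled responses to zero.
        responses = responses * mask
    return weights @ responses  # (C,) logits

def rank_filters(images, kernels, weights):
    """Rank filters by the mean absolute change in the predicted-class logit
    when each filter's response is zeroed (an ablation-style counterfactual)."""
    K = kernels.shape[0]
    effects = np.zeros(K)
    for image in images:
        factual = forward(image, kernels, weights)
        c = int(np.argmax(factual))  # class predicted without intervention
        for k in range(K):
            mask = np.ones(K)
            mask[k] = 0.0
            counterfactual = forward(image, kernels, weights, mask)
            effects[k] += abs(factual[c] - counterfactual[c])
    effects /= len(images)
    ranking = np.argsort(-effects)  # most important filter first
    return ranking, effects

# Toy data (assumption): 4 random 8x8 images, 3 filters of size 3x3, 2 classes.
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8))
kernels = rng.standard_normal((3, 3, 3))
weights = rng.standard_normal((2, 3))
weights[:, 2] = 0.0  # filter 2 has no path to the logits

ranking, effects = rank_filters(images, kernels, weights)
print("ranking:", ranking)
print("effects:", effects)
```

Because filter 2 is disconnected from the logits in this toy setup, zeroing it changes nothing and its measured effect is zero. A real CNN would need the intervention applied inside the network and evaluated on held-out data, and the paper's SCM-based formulation may define importance differently.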