Visualizing Deep Neural Network Decisions: Prediction Difference Analysis
This work addresses the interpretability of black-box classifiers, which is crucial for improving models and enabling adoption in critical domains like medicine.
The authors tackled the problem of interpreting deep neural network decisions by introducing prediction difference analysis, a method that highlights image regions influencing classification, and demonstrated its effectiveness on both natural and medical images.
This article presents the prediction difference analysis method for visualizing the response of a deep neural network to a specific input. When classifying images, the method highlights areas in a given input image that provide evidence for or against a certain class. It overcomes several shortcomings of previous methods and provides additional insight into the decision-making process of classifiers. Making neural network decisions interpretable through visualization is important both to improve models and to accelerate the adoption of black-box classifiers in application areas such as medicine. We illustrate the method in experiments on natural images (ImageNet data), as well as medical images (MRI brain scans).
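
To make the idea of "evidence for or against a class" concrete, the following is a minimal sketch of a prediction-difference computation, not the authors' implementation. It hides one image patch at a time, replaces it with sampled values, and compares the class probability with and without the patch; the comparison is done in log-odds here, which is one possible choice. The names `prediction_difference`, `toy_classifier`, `patch`, and `n_samples` are hypothetical, and drawing replacement values from the image's own pixels is a simplifying assumption, since the abstract does not specify a sampling scheme.

```python
import numpy as np


def log_odds(p, eps=1e-6):
    """Log-odds of a probability, clipped to avoid division by zero."""
    p = np.clip(p, eps, 1.0 - eps)
    return np.log2(p / (1.0 - p))


def prediction_difference(image, predict_proba, target, patch=2,
                          n_samples=10, rng=None):
    """Per-pixel relevance map for class `target`.

    Positive values: the patch is evidence for the class.
    Negative values: the patch is evidence against the class.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = image.shape
    relevance = np.zeros((h, w))
    counts = np.zeros((h, w))
    p_full = predict_proba(image)[target]

    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            p_sum = 0.0
            for _ in range(n_samples):
                perturbed = image.copy()
                # Assumption: replacement values are drawn from the image's
                # own pixels, a crude stand-in for a proper sampling scheme.
                perturbed[i:i + patch, j:j + patch] = rng.choice(
                    image.ravel(), size=(patch, patch))
                p_sum += predict_proba(perturbed)[target]
            p_without = p_sum / n_samples
            diff = log_odds(p_full) - log_odds(p_without)
            # Spread the patch score over its pixels; overlapping patches
            # are averaged at the end.
            relevance[i:i + patch, j:j + patch] += diff
            counts[i:i + patch, j:j + patch] += 1
    return relevance / counts


def toy_classifier(img):
    """Hypothetical two-class model: bright top-left supports class 1,
    bright bottom-right opposes it."""
    score = img[:3, :3].sum() - img[3:, 3:].sum()
    p1 = 1.0 / (1.0 + np.exp(-score))
    return np.array([1.0 - p1, p1])


image = np.zeros((6, 6))
image[:3, :3] = 1.0   # region that supports class 1
image[3:, 3:] = 1.0   # region that opposes class 1
relevance = prediction_difference(image, toy_classifier, target=1)
```

In this toy setup the top-left corner of `relevance` comes out positive and the bottom-right corner negative, while regions the classifier ignores stay at zero. A real application would substitute a trained network for `toy_classifier` and a more careful sampling distribution for the replaced patch.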