Assessing the validity of saliency maps for abnormality localization in medical imaging
This work addresses the lack of quantitative evaluation of saliency maps in medical imaging, which is crucial for clinicians who rely on interpretability in diagnosis.
The study assessed the validity of saliency maps for localizing abnormalities in medical imaging using the RSNA Pneumonia dataset, finding that GradCAM was most sensitive to model parameter and label randomization and highly agnostic to model architecture.
Saliency maps have become a widely used method for assessing which areas of an input image are most pertinent to the prediction of a trained neural network. However, to our knowledge, no study in medical imaging has examined the efficacy of these techniques and quantified it using overlap with ground truth bounding boxes. In this work, we explored the credibility of various existing saliency map methods on the RSNA Pneumonia dataset. We found that GradCAM was the most sensitive to model parameter and label randomization, and was highly agnostic to model architecture.
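The abstract does not state which overlap measure was used. As a minimal sketch of one way to score a saliency map against ground truth bounding boxes, the example below min-max normalizes the map, thresholds it, and computes intersection over union (IoU) with the union of the boxes. The function names, the (x, y, width, height) pixel box format, and the 0.5 threshold are all assumptions made for illustration, not details taken from the study.

```python
import numpy as np


def box_mask(shape, boxes):
    """Return a boolean mask that is True inside any (x, y, width, height) box.

    The box format is an assumption for this sketch; coordinates are in pixels.
    """
    mask = np.zeros(shape, dtype=bool)
    for x, y, w, h in boxes:
        mask[y:y + h, x:x + w] = True
    return mask


def saliency_iou(saliency, boxes, threshold=0.5):
    """IoU between the thresholded saliency map and the union of the boxes.

    `threshold` is a hypothetical cutoff applied after min-max normalization.
    """
    saliency = np.asarray(saliency, dtype=float)
    lo, hi = saliency.min(), saliency.max()
    if hi == lo:
        # A constant map highlights nothing in particular; score it as 0.
        return 0.0
    normalized = (saliency - lo) / (hi - lo)
    salient = normalized >= threshold
    truth = box_mask(saliency.shape, boxes)
    union = np.logical_or(salient, truth).sum()
    if union == 0:
        return 0.0
    intersection = np.logical_and(salient, truth).sum()
    return float(intersection / union)
```

A score near 1 means the highlighted region coincides with the annotated abnormality, while a score near 0 means the map points elsewhere. Under these assumptions, the same score could be recomputed after randomizing model parameters or labels to see how much a given saliency method's localization degrades.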