Sanity Checks for Saliency Methods Explaining Object Detectors
This work addresses the reliability of explanation methods for object detection models, which matters to practitioners in computer vision and AI safety, although it is incremental in that it builds on prior evaluations of classification models.
The study extends sanity checks for saliency methods from classification to object detectors and finds that interpretability depends more on the model than on the explanation method, with EfficientDet-D0 performing best across the tests.
Saliency methods are frequently used to explain Deep Neural Network-based models. Adebayo et al.'s work on evaluating saliency methods for classification models illustrates that certain explanation methods fail the model and data randomization tests. However, on extending these tests to various state-of-the-art object detectors, we illustrate that the ability to explain a model depends more on the model itself than on the explanation method. We perform sanity checks for object detection and define new qualitative criteria to evaluate the saliency explanations, both for object classification and bounding box decisions, using Guided Backpropagation, Integrated Gradients, and their SmoothGrad versions, together with Faster R-CNN, SSD, and EfficientDet-D0, trained on COCO. In addition, the sensitivity of the explanation method to model parameters and data labels varies class-wise, which motivates performing the sanity checks for each class. We find that EfficientDet-D0 is the most interpretable model independent of the saliency method, and it passes the sanity checks with few problems.
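The idea behind the model randomization test can be illustrated with a minimal sketch. It is not the implementation used in this work: a toy linear scorer stands in for a detector, Pearson correlation is an assumed similarity measure chosen for simplicity, and all function names are hypothetical. An explanation that depends on the model should change when the weights are randomized, whereas a model-independent map (here, a simple edge filter) stays the same and therefore fails the check.

```python
import random

def gradient_saliency(weights, x):
    # For a linear score s = sum(w_i * x_i), ds/dx_i = w_i,
    # so the absolute gradient depends only on the model weights.
    return [abs(w) for w in weights]

def edge_saliency(weights, x):
    # Model-independent baseline: absolute difference between neighbouring
    # inputs (circular at index 0). It ignores the weights entirely.
    return [abs(x[i] - x[i - 1]) for i in range(len(x))]

def pearson(a, b):
    # Pearson correlation between two equally long saliency maps.
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5

def model_randomization_similarity(saliency_fn, weights, x, rng):
    # Replace the "trained" weights with random ones and compare the
    # explanation before and after. High similarity suggests the
    # explanation is insensitive to the model parameters.
    randomized = [rng.gauss(0.0, 1.0) for _ in weights]
    return pearson(saliency_fn(weights, x), saliency_fn(randomized, x))

if __name__ == "__main__":
    rng = random.Random(0)
    weights = [rng.gauss(0.0, 1.0) for _ in range(100)]  # stand-in for trained parameters
    x = [rng.random() for _ in range(100)]               # stand-in for an input image
    for fn in (gradient_saliency, edge_saliency):
        sim = model_randomization_similarity(fn, weights, x, rng)
        print(f"{fn.__name__}: similarity after randomization = {sim:.3f}")
```

In a class-wise setting, the same comparison would be repeated separately for the detections of each class, since the sensitivity can differ between classes.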