Grid Saliency for Context Explanations of Semantic Segmentation
This work addresses the problem of providing visual explanations for semantic segmentation models. The contribution is incremental: it extends existing saliency methods to a new task.
The paper addresses the restriction of existing saliency methods to image classification by extending them to produce grid saliencies for dense prediction networks, specifically semantic segmentation, which yields spatially coherent visual explanations. The results show that grid saliency provides interpretable context explanations and can detect contextual biases, as demonstrated on a synthetic dataset and on the real-world Cityscapes dataset using state-of-the-art networks.
Recently, there has been a growing interest in developing saliency methods that provide visual explanations of network predictions. Still, the usability of existing methods is limited to image classification models. To overcome this limitation, we extend the existing approaches to generate grid saliencies, which provide spatially coherent visual explanations for (pixel-level) dense prediction networks. As the proposed grid saliency makes it possible to spatially disentangle the object and its context, we specifically explore its potential to produce context explanations for semantic segmentation networks, discovering which context most influences the class predictions inside a target object area. We investigate the effectiveness of grid saliency on a synthetic dataset with an artificially induced bias between objects and their context, as well as on the real-world Cityscapes dataset using state-of-the-art segmentation networks. Our results show that grid saliency can be successfully used to provide easily interpretable context explanations and, moreover, can be employed for detecting and localizing contextual biases present in the data.
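To make the idea of a context explanation concrete, the following is a minimal sketch of one way a coarse saliency grid could be computed for a target object area. The abstract does not specify the procedure, so the sketch uses a simple occlusion heuristic as a stand-in rather than the method proposed in the paper. All names (`grid_context_saliency`, `toy_predict`, `request_mask`, `fill`) are hypothetical, and the "network" is a toy function with a deliberately built-in context bias.

```python
import numpy as np

def grid_context_saliency(image, predict, request_mask, target_class,
                          grid=(4, 4), fill=0.0):
    """Occlusion-based stand-in (an assumption, not the paper's method).

    Scores each grid cell by how much hiding its context pixels lowers the
    mean target-class score inside the requested object area.
    `predict` maps an image to an array of shape (num_classes, H, W).
    """
    h, w = request_mask.shape
    gh, gw = grid
    base = predict(image)[target_class][request_mask].mean()
    saliency = np.zeros(grid)
    ys = np.linspace(0, h, gh + 1).astype(int)  # row boundaries of the cells
    xs = np.linspace(0, w, gw + 1).astype(int)  # column boundaries of the cells
    for i in range(gh):
        for j in range(gw):
            cell = np.zeros((h, w), dtype=bool)
            cell[ys[i]:ys[i + 1], xs[j]:xs[j + 1]] = True
            cell &= ~request_mask  # perturb context only, never the object
            if not cell.any():
                continue
            perturbed = image.copy()
            perturbed[cell] = fill
            score = predict(perturbed)[target_class][request_mask].mean()
            saliency[i, j] = max(base - score, 0.0)
    return saliency

def toy_predict(img):
    """Hypothetical biased segmenter: the class-1 score at every pixel
    depends only on the brightness of the top-left 4x4 context region."""
    ctx = img[:4, :4].mean()
    fg = np.full(img.shape, ctx)
    return np.stack([1.0 - fg, fg])

image = np.ones((8, 8))
request_mask = np.zeros((8, 8), dtype=bool)
request_mask[6:, 6:] = True  # target object area in the bottom-right corner

sal = grid_context_saliency(image, toy_predict, request_mask,
                            target_class=1, grid=(2, 2))
```

In this toy setup only the top-left grid cell receives a non-zero score, because that is the only context region the biased predictor looks at. This mirrors, in a very simplified form, how a saliency grid can localize a contextual bias away from the object itself.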