SS-CAM: Smoothed Score-CAM for Sharper Visual Feature Localization
This work addresses the need for more precise interpretability of deep learning models, particularly in high-risk applications, though the contribution is incremental because it builds directly on an existing method, Score-CAM.
The paper tackles the problem of improving visual feature localization in deep convolutional neural networks. It introduces SS-CAM, an enhancement of Score-CAM that uses a smooth operation to produce sharper, more centralized explanations of object features, and it reports better results than Score-CAM on faithfulness and localization tasks on the ILSVRC 2012 Validation dataset.
Interpreting the underlying mechanisms of deep convolutional neural networks has become an important research topic in deep learning because these models are deployed in high-risk environments. Many methods have been proposed to explain these black-box architectures so that their internal decisions can be analyzed and understood. In this paper, building on Score-CAM, we introduce SS-CAM, an enhanced visual explanation method that improves visual sharpness: it produces centralized localization of object features within an image through a smooth operation. We evaluate our method on the ILSVRC 2012 Validation dataset, where it outperforms Score-CAM on both faithfulness and localization tasks.
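The abstract does not spell out the smooth operation, so the following is only a minimal sketch of one plausible reading: Score-CAM's per-channel scoring, with each normalized activation mask perturbed by Gaussian noise and the resulting class scores averaged over several samples before the softmax weighting. The function names, the parameters `n_samples` and `sigma`, and the assumption that activation maps are already upsampled to the input size are all illustrative choices, not details taken from the paper.

```python
import numpy as np


def normalize(a):
    """Min-max scale an array to [0, 1]; constant arrays map to zeros."""
    span = a.max() - a.min()
    return (a - a.min()) / span if span > 0 else np.zeros_like(a)


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def smoothed_score_cam(image, activations, score_fn,
                       n_samples=8, sigma=0.1, seed=0):
    """Hypothetical smoothed Score-CAM sketch.

    image:       (H, W) input array.
    activations: (K, H, W) activation maps, assumed already upsampled.
    score_fn:    callable mapping a masked image to a scalar class score
                 (stands in for a forward pass through the network).
    """
    rng = np.random.default_rng(seed)
    scores = np.zeros(len(activations))
    for k, act in enumerate(activations):
        mask = normalize(act)
        # Assumed smooth operation: average the class score over
        # several noisy copies of the normalized mask.
        for _ in range(n_samples):
            noisy_mask = mask + rng.normal(0.0, sigma, size=mask.shape)
            scores[k] += score_fn(image * noisy_mask)
        scores[k] /= n_samples
    # Score-CAM style weighting: softmax over channel scores, then a
    # ReLU applied to the weighted sum of activation maps.
    weights = softmax(scores)
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    return normalize(cam)
```

In practice `score_fn` would wrap the target-class output of the trained network; averaging over noisy masks is intended to reduce the sensitivity of each channel weight to small perturbations of a single mask.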