What Do Deep Saliency Models Learn about Visual Attention?
This work addresses the lack of interpretability in deep saliency models, giving computer vision researchers and practitioners a principled method to analyze and improve them. The contribution is incremental in that it builds on existing saliency prediction techniques.
The authors investigate what deep saliency models learn about visual attention. They develop an analytic framework that decomposes implicit features into interpretable bases aligned with semantic attributes, apply it to analyze aspects such as the impact of training data and common failure patterns, and demonstrate its effectiveness in application scenarios such as the atypical attention of people with autism spectrum disorder.
In recent years, deep saliency models have made significant progress in predicting human visual attention. However, the mechanisms behind their success remain largely unexplained due to the opaque nature of deep neural networks. In this paper, we present a novel analytic framework that sheds light on the implicit features learned by saliency models and provides principled interpretation and quantification of their contributions to saliency prediction. Our approach decomposes these implicit features into interpretable bases that are explicitly aligned with semantic attributes and reformulates saliency prediction as a weighted combination of probability maps connecting the bases and saliency. By applying our framework, we conduct extensive analyses from various perspectives, including the positive and negative weights of semantics, the impact of training data and architectural designs, the progressive influences of fine-tuning, and common failure patterns of state-of-the-art deep saliency models. Additionally, we demonstrate the effectiveness of our framework by exploring visual attention characteristics in various application scenarios, such as the atypical attention of people with autism spectrum disorder, attention to emotion-eliciting stimuli, and attention evolution over time. Our code is publicly available at \url{https://github.com/szzexpoi/saliency_analysis}.
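To make the reformulation concrete, the following is a minimal NumPy sketch of saliency expressed as a weighted combination of per-basis probability maps. It is an illustration under stated assumptions, not the authors' implementation: the function names, the tensor shapes, the use of a sigmoid over feature-basis dot products to obtain probability maps, and the random inputs are all hypothetical choices made for this example.

```python
import numpy as np


def basis_probability_maps(features, bases):
    """Map features (H, W, C) and bases (K, C) to K maps with values in (0, 1).

    Assumption: each map is the sigmoid of the dot product between the
    feature vector at a location and one basis vector.
    """
    logits = np.einsum("hwc,kc->khw", features, bases)
    return 1.0 / (1.0 + np.exp(-logits))


def weighted_saliency(prob_maps, weights):
    """Combine (K, H, W) probability maps with (K,) signed weights into (H, W)."""
    return np.tensordot(weights, prob_maps, axes=1)


# Hypothetical toy inputs: a 4x5 feature map with 8 channels and 3 bases.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5, 8))
bases = rng.normal(size=(3, 8))
weights = np.array([0.7, -0.2, 0.5])  # positive and negative contributions

prob_maps = basis_probability_maps(features, bases)
saliency = weighted_saliency(prob_maps, weights)

# Per-basis contribution to the final map; these sum to the saliency map.
contributions = weights[:, None, None] * prob_maps
```

Because the output is linear in the weights, the contribution of each basis can be read off directly, which is what allows positive and negative semantic weights to be compared across models in this kind of analysis.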