Interpreting Low-level Vision Models with Causal Effect Maps
This work helps researchers and practitioners in low-level vision understand deep models. It offers a diagnostic tool that yields new insights, although applying causality theory to this domain is an incremental contribution.
The paper tackles the interpretability of deep neural networks for low-level vision tasks. It introduces a model- and task-agnostic method, Causal Effect Map (CEM), that visualizes and quantifies input-output relationships. The analysis reveals, for example, that larger receptive fields do not always improve outcomes and that global mechanisms may be ineffective in denoising.
Deep neural networks have significantly improved the performance of low-level vision tasks but have also made these models harder to interpret. A thorough understanding of deep models benefits both network design and practical reliability. To take up this challenge, we introduce causality theory to interpret low-level vision models and propose a model-/task-agnostic method called Causal Effect Map (CEM). With CEM, we can visualize and quantify input-output relationships in terms of both positive and negative effects. After analyzing various low-level vision tasks with CEM, we have reached several interesting insights, such as: (1) Using more information from input images (e.g., a larger receptive field) does NOT always yield positive outcomes. (2) Attempting to incorporate mechanisms with a global receptive field (e.g., channel attention) into image denoising may prove futile. (3) Integrating multiple tasks to train a general model could encourage the network to prioritize local information over global context. Based on causal effect theory, the proposed diagnostic tool can refresh our common knowledge and bring a deeper understanding of low-level vision models. Code is available at https://github.com/J-FHu/CEM.
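The abstract does not spell out how CEM is computed, so the following is only a minimal sketch of the general idea behind an intervention-based effect map, not the authors' method. It assumes a grayscale image, a hypothetical stand-in model (`box_filter`), and a hypothetical helper (`patch_effect_map`) that replaces one input patch at a time with a baseline value and records how the error inside an output region of interest changes. A positive entry means the patch was helping; a negative entry means it was hurting.

```python
import numpy as np

def box_filter(img, k=3):
    """Toy stand-in for a restoration network: a k x k mean filter (assumption)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def patch_effect_map(model, x, target, roi, patch=4, baseline=None):
    """Hypothetical helper: per-patch change in ROI error under intervention.

    effect > 0: replacing the patch raised the ROI error (the patch helped).
    effect < 0: replacing the patch lowered the ROI error (the patch hurt).
    """
    y0, y1, x0, x1 = roi
    # Intervention value: the input mean unless a baseline is given (assumption).
    fill = float(x.mean()) if baseline is None else baseline

    def roi_error(inp):
        out = model(inp)
        return float(np.mean((out[y0:y1, x0:x1] - target[y0:y1, x0:x1]) ** 2))

    base_err = roi_error(x)
    h, w = x.shape
    effects = np.zeros((h // patch, w // patch))
    for i in range(h // patch):
        for j in range(w // patch):
            x_do = x.copy()
            x_do[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = fill
            effects[i, j] = roi_error(x_do) - base_err
    return effects

# Usage on synthetic data: a horizontal ramp with additive Gaussian noise.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
effects = patch_effect_map(box_filter, noisy, clean, roi=(6, 10, 6, 10))
```

Because the toy filter has a 3 x 3 receptive field, patches far from the region of interest receive an effect of zero, while patches overlapping it do not. A real network with a larger receptive field would spread nonzero effects more widely, which is the kind of pattern the abstract's insights refer to.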