CV | AI | Sep 18, 2025

[Re] Improving Interpretation Faithfulness for Vision Transformers

arXiv:2509.14846v1 | 1 citation | Trans. Mach. Learn. Res.
Originality: Synthesis-oriented
AI Analysis

This incremental reproduction study verifies claims about improving interpretation faithfulness for Vision Transformers. It is relevant to researchers in explainable AI.

This work reproduces and extends a study on Faithful Vision Transformers (FViTs), finding that Diffusion Denoised Smoothing (DDS) broadly improves interpretability robustness to attacks in segmentation and classification tasks, with minor discrepancies noted.

This work aims to reproduce the results of Faithful Vision Transformers (FViTs) proposed by arXiv:2311.17983, alongside interpretability methods for Vision Transformers from arXiv:2012.09838 and Xu et al. (2022). We investigate the claims made by arXiv:2311.17983 that using Diffusion Denoised Smoothing (DDS) improves interpretability robustness to (1) attacks in a segmentation task and (2) perturbations and attacks in a classification task. We also extend the original study by investigating the authors' claim that adding DDS to any interpretability method can improve its robustness under attack. We test this on baseline methods and on the recently proposed Attribution Rollout method. In addition, we measure the computational costs and environmental impact of obtaining an FViT through DDS. Our results broadly agree with the original study's findings, although we found and discuss minor discrepancies.

