CV Apr 4, 2024

LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

Walid Bousselham, Angie Boggust, Sofian Chaybouti, Hendrik Strobelt, Hilde Kuehne

MIT

arXiv:2404.03214v219.032 citations h-index: 9 Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the problem of model transparency for users of Vision Transformers in computer vision, though it is an incremental improvement on existing explainability techniques.

The paper tackles the challenge of interpretability in Vision Transformers by proposing LeGrad, an explainability method that computes gradients with respect to attention maps, resulting in superior spatial fidelity and robustness compared to other state-of-the-art methods.

Vision Transformers (ViTs), with their ability to model long-range dependencies through self-attention mechanisms, have become a standard architecture in computer vision. However, the interpretability of these models remains a challenge. To address this, we propose LeGrad, an explainability method specifically designed for ViTs. LeGrad computes the gradient with respect to the attention maps of ViT layers, considering the gradient itself as the explainability signal. We aggregate the signal over all layers, combining the activations of the last as well as intermediate tokens to produce the merged explainability map. This makes LeGrad a conceptually simple and easy-to-implement tool for enhancing the transparency of ViTs. We evaluate LeGrad in challenging segmentation, perturbation, and open-vocabulary settings, showcasing its versatility compared to other state-of-the-art (SotA) explainability methods and demonstrating its superior spatial fidelity and robustness to perturbations. A demo and the code are available at https://github.com/WalBouss/LeGrad.
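The layer-wise aggregation described in the abstract can be illustrated with a minimal, framework-free sketch. It assumes the gradients of a scalar score with respect to each layer's attention maps have already been computed by an autograd library; computing them is out of scope here. The function names `layer_map` and `merged_map` are hypothetical. Keeping only positive gradients, averaging over heads and query tokens, dropping the first (class) token, and min-max normalizing are assumptions that the abstract does not spell out, so this should be read as one plausible reading rather than the authors' implementation (see the linked repository for that).

```python
from typing import List

# One head's gradient w.r.t. its attention map:
# rows are query tokens, columns are key tokens.
Matrix = List[List[float]]


def layer_map(head_grads: List[Matrix]) -> List[float]:
    """Collapse one layer's attention gradients into one score per key token."""
    n_heads = len(head_grads)
    n_queries = len(head_grads[0])
    n_keys = len(head_grads[0][0])
    scores = [0.0] * n_keys
    for head in head_grads:
        for row in head:
            for k, g in enumerate(row):
                # Assumption: only positive gradients count as evidence.
                scores[k] += max(g, 0.0)
    # Average over heads and query tokens.
    return [s / (n_heads * n_queries) for s in scores]


def merged_map(grads_per_layer: List[List[Matrix]],
               drop_first_token: bool = True) -> List[float]:
    """Sum the per-layer maps and rescale the result to [0, 1]."""
    n_keys = len(grads_per_layer[0][0][0])
    total = [0.0] * n_keys
    for head_grads in grads_per_layer:
        for k, s in enumerate(layer_map(head_grads)):
            total[k] += s
    if drop_first_token:
        # Assumption: token 0 is a class token, not an image patch.
        total = total[1:]
    lo, hi = min(total), max(total)
    if hi == lo:
        return [0.0] * len(total)
    return [(t - lo) / (hi - lo) for t in total]


# Toy input: 2 layers, 1 head each, 3 tokens (class token + 2 patches).
toy_grads = [
    [[[0.0, 1.0, -2.0], [0.0, 3.0, 0.0], [0.0, 0.0, 1.0]]],
    [[[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]],
]
heatmap = merged_map(toy_grads)
```

In this toy input the first patch receives the larger positive gradients across both layers, so it ends up with the highest normalized score. For a real ViT, the flat list of patch scores would be reshaped to the patch grid and upsampled to image resolution.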

