Interpretable Vision Transformers in Image Classification via SVDA
This work addresses the interpretability of computer vision models and is relevant to both researchers and practitioners, but its contribution is incremental: it adapts an existing method to a new architecture.
The paper tackles the problem of opaque attention mechanisms in Vision Transformers for image classification by adapting the SVD-Inspired Attention (SVDA) mechanism, yielding more interpretable attention patterns without sacrificing accuracy on benchmarks such as CIFAR-10 and ImageNet-100.
Vision Transformers (ViTs) have achieved state-of-the-art performance in image classification, yet their attention mechanisms often remain opaque and exhibit dense, unstructured behavior. In this work, we adapt our previously proposed SVD-Inspired Attention (SVDA) mechanism to the ViT architecture, introducing a geometrically grounded formulation that enhances interpretability, sparsity, and spectral structure. We use the interpretability indicators originally proposed with SVDA to monitor attention dynamics during training and to assess structural properties of the learned representations. Experimental evaluations on four widely used benchmarks (CIFAR-10, FashionMNIST, CIFAR-100, and ImageNet-100) demonstrate that SVDA consistently yields more interpretable attention patterns without sacrificing classification accuracy. While the current framework offers descriptive insights rather than prescriptive guidance, our results establish SVDA as a comprehensive and informative tool for analyzing and developing structured attention models in computer vision. This work lays the foundation for future advances in explainable AI, spectral diagnostics, and attention-based model compression.