CL · Sep 21, 2025

Attention Consistency for LLMs Explanation

Tian Lan, Jinyuan Xu, Xue He, Jenq-Neng Hwang, Lei Li

arXiv:2509.17178v1 · 13.98 citations · h-index: 10 · EMNLP

Originality: Incremental advance

AI Analysis

The paper addresses interpretability challenges facing LLM developers and users; the advance is incremental because it builds on existing attention-based methods.

The paper tackles the problem of understanding LLM decision-making by proposing the Multi-Layer Attention Consistency Score (MACS), a lightweight heuristic for estimating token importance, which achieves faithfulness comparable to complex methods with a 22% decrease in VRAM usage and 30% reduction in latency.

Understanding the decision-making processes of large language models (LLMs) is essential for their trustworthy development and deployment. However, current interpretability methods often face challenges such as low resolution and high computational cost. To address these limitations, we propose the Multi-Layer Attention Consistency Score (MACS), a novel, lightweight, and easily deployable heuristic for estimating the importance of input tokens in decoder-based models. MACS measures contributions of input tokens based on the consistency of maximal attention. Empirical evaluations demonstrate that MACS achieves a favorable trade-off between interpretability quality and computational efficiency, showing faithfulness comparable to complex techniques with a 22% decrease in VRAM usage and 30% reduction in latency.
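
The abstract gives no formula for MACS, so the following is only a rough sketch of one plausible reading of "consistency of maximal attention", not the authors' definition. It assumes attention weights are available as an array of shape (layers, heads, seq, seq), takes the strongest head per layer for a chosen query position, and combines layers with a geometric mean so that a token scores highly only if it is attended in every layer. The function name, the geometric-mean aggregation, and the final normalization are all illustrative assumptions.

```python
import numpy as np


def attention_consistency_scores(attn, query_index=-1, eps=1e-12):
    """Hypothetical token-importance heuristic (NOT the paper's exact MACS).

    attn: array of shape (layers, heads, seq, seq); attn[l, h, q, k] is the
          weight that query position q puts on key position k.
    Returns a length-seq vector of non-negative scores that sums to 1.
    """
    attn = np.asarray(attn, dtype=float)
    if attn.ndim != 4 or attn.shape[2] != attn.shape[3]:
        raise ValueError("expected shape (layers, heads, seq, seq)")

    # Maximal attention: for each layer, keep the strongest head's weight
    # from the chosen query position to every input token.
    max_over_heads = attn[:, :, query_index, :].max(axis=1)  # (layers, seq)

    # Consistency (assumed aggregation): geometric mean across layers.
    # A token ignored in any single layer is pulled toward zero.
    scores = np.exp(np.log(max_over_heads + eps).mean(axis=0))  # (seq,)

    # Normalize so scores are comparable across inputs.
    return scores / scores.sum()
```

With a Hugging Face decoder, an array of this shape could be built by stacking the per-layer tensors returned when a model is called with `output_attentions=True`; that wiring is omitted here and would need checking against the specific model.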
