CL | Sep 21, 2025

Attention Consistency for LLMs Explanation

arXiv:2509.17178v1 | 8 citations | h-index: 10 | EMNLP
Originality: Incremental advance
AI Analysis

The paper addresses interpretability challenges faced by LLM developers and users, though the advance is incremental because it builds on existing attention-based methods.

The paper tackles the problem of understanding LLM decision-making by proposing the Multi-Layer Attention Consistency Score (MACS), a lightweight heuristic for estimating token importance, which achieves faithfulness comparable to complex methods with a 22% decrease in VRAM usage and 30% reduction in latency.

Understanding the decision-making processes of large language models (LLMs) is essential for their trustworthy development and deployment. However, current interpretability methods often face challenges such as low resolution and high computational cost. To address these limitations, we propose the Multi-Layer Attention Consistency Score (MACS), a novel, lightweight, and easily deployable heuristic for estimating the importance of input tokens in decoder-based models. MACS measures contributions of input tokens based on the consistency of maximal attention. Empirical evaluations demonstrate that MACS achieves a favorable trade-off between interpretability quality and computational efficiency, showing faithfulness comparable to complex techniques with a 22% decrease in VRAM usage and 30% reduction in latency.
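The abstract does not give the MACS formula, so the following is only an illustrative sketch of what "consistency of maximal attention" could look like in code, not the authors' definition. It assumes a hypothetical input layout `attn[layer][head][token]` holding the attention paid by the final query position to each input token. The function name and the choice to combine layers by multiplication are likewise assumptions made for the example.

```python
from math import prod


def attention_consistency_scores(attn):
    """Toy token-importance heuristic (an assumption, not the paper's exact MACS).

    attn[layer][head][token] is the attention weight from the last query
    position to each input token.

    Steps:
      1. Per layer, keep each token's maximal attention across heads.
      2. Multiply those maxima across layers, so a token scores highly only
         if it is strongly attended in every layer (the "consistency" part).
      3. Normalize the scores so they sum to 1.
    """
    num_tokens = len(attn[0][0])

    # Step 1: max over heads, separately for every layer and token.
    per_layer_max = [
        [max(head[t] for head in layer) for t in range(num_tokens)]
        for layer in attn
    ]

    # Step 2: product across layers for each token.
    raw = [prod(layer[t] for layer in per_layer_max) for t in range(num_tokens)]

    # Step 3: normalize, guarding against an all-zero input.
    total = sum(raw)
    if total == 0:
        return [0.0] * num_tokens
    return [r / total for r in raw]


# Two layers, two heads, three input tokens (made-up numbers).
example_attn = [
    [[0.1, 0.7, 0.2], [0.3, 0.5, 0.2]],
    [[0.2, 0.6, 0.2], [0.1, 0.8, 0.1]],
]
scores = attention_consistency_scores(example_attn)
most_important = max(range(len(scores)), key=lambda i: scores[i])
```

In this toy input the middle token receives the largest attention in both layers, so it ends up with the highest score. A heuristic of this shape needs only the attention weights from a single forward pass, with no gradients and no repeated perturbed runs, which is the kind of saving the reported VRAM and latency reductions would be consistent with.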

