CL · Jul 16, 2025

DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression

arXiv:2507.11942v1 · 7 citations · h-index: 24 · Has Code · ACL
Originality: Incremental advance
AI Analysis

This work addresses prompt compression for reducing computational overhead in long-context scenarios, representing an incremental improvement over existing methods.

The paper tackles task-agnostic prompt compression by accounting for two aspects that entropy-based methods overlook: attention-critical tokens and shifts in entropy during compression. The authors report robust and substantial improvements across tasks and LLMs in experiments on LongBench, GSM8K, and BBH.

Task-agnostic prompt compression leverages the redundancy in natural language to reduce computational overhead and enhance information density within prompts, especially in long-context scenarios. Existing methods predominantly rely on information entropy as the metric to compress lexical units, aiming to achieve minimal information loss. However, these approaches overlook two critical aspects: (i) the importance of attention-critical tokens at the algorithmic level, and (ii) shifts in information entropy during the compression process. Motivated by these challenges, we propose a dynamic attention-aware approach for task-agnostic prompt compression (DAC). This approach effectively integrates entropy and attention information, dynamically sensing entropy shifts during compression to achieve fine-grained prompt compression. Extensive experiments across various domains, including LongBench, GSM8K, and BBH, show that DAC consistently yields robust and substantial improvements across a diverse range of tasks and LLMs, offering compelling evidence of its efficacy.
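The abstract's core idea, combining entropy with attention and re-scoring as tokens are removed, can be illustrated with a minimal toy sketch. This is not the authors' implementation: the function names (`compress`, `toy_self_information`), the weighted-sum combination with `alpha`, the min-max normalization, and the even split of removals across `stages` are all assumptions made for illustration. In a real setting, `info_fn` and `attn_fn` would be backed by a language model; here a unigram-frequency stand-in keeps the example self-contained.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence

# A scorer maps the *current* token list to one score per token (hypothetical interface).
Scorer = Callable[[Sequence[str]], List[float]]


def toy_self_information(tokens: Sequence[str]) -> List[float]:
    """Toy stand-in for LM entropy: -log of each token's frequency in the current prompt."""
    counts = Counter(tokens)
    total = len(tokens)
    return [-math.log(counts[t] / total) for t in tokens]


def _normalize(xs: Sequence[float]) -> List[float]:
    """Min-max scale to [0, 1]; constant inputs map to 1.0 for every position."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [1.0] * len(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def compress(
    tokens: Sequence[str],
    info_fn: Scorer,
    attn_fn: Scorer,
    keep_ratio: float = 0.5,
    stages: int = 2,
    alpha: float = 0.5,
) -> List[str]:
    """Drop low-scoring tokens in several stages, re-scoring after each stage.

    Re-scoring is the 'dynamic' part: removing tokens changes the context,
    so information scores computed on the original prompt may no longer hold.
    """
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    current = list(tokens)
    target = max(1, round(len(current) * keep_ratio))
    for remaining in range(stages, 0, -1):
        if len(current) <= target:
            break
        # Spread the remaining removals evenly over the remaining stages.
        n_drop = math.ceil((len(current) - target) / remaining)
        info = _normalize(info_fn(current))  # recomputed on the shortened prompt
        attn = _normalize(attn_fn(current))
        # Assumed combination rule: a weighted sum of the two normalized signals.
        scores = [alpha * i + (1 - alpha) * a for i, a in zip(info, attn)]
        drop = set(sorted(range(len(current)), key=lambda k: scores[k])[:n_drop])
        current = [t for k, t in enumerate(current) if k not in drop]
    return current


if __name__ == "__main__":
    prompt = "the cat sat on the mat and the dog sat on the rug".split()
    # Hypothetical attention signal: pretend the model attends heavily to "sat".
    attention = lambda toks: [1.0 if t == "sat" else 0.0 for t in toks]
    print(compress(prompt, toy_self_information, attention))
```

Setting `stages=1` reduces the sketch to a single-pass filter that scores the prompt once, which is the behavior the abstract contrasts against; larger values re-score more often at the cost of more scoring calls.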

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
