CL · Jun 10, 2025

Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding

arXiv:2506.08371v2 · 3 citations · h-index: 11 · ACL
Originality: Highly original
AI Analysis

This paper addresses a critical issue for users of long-context LLMs: it offers a cost-effective way to improve long-context performance without retraining.

The paper tackles the problem of performance degradation in long-context LLMs by identifying the Posterior Salience Attenuation phenomenon and proposing a training-free method called Positional Contrastive Decoding, which achieves state-of-the-art results on long-context benchmarks.

While Large Language Models (LLMs) support long contexts, their performance still degrades within the context window. Current solutions incur prohibitive training costs, leaving statistical behaviors and cost-effective approaches underexplored. From the decoding perspective, we identify the Posterior Salience Attenuation (PSA) phenomenon, in which the salience ratio correlates with long-text performance degradation. Notably, despite the attenuation, gold tokens still occupy high-ranking positions in the decoding space. Motivated by this observation, we propose the training-free Positional Contrastive Decoding (PCD), which contrasts the logits derived from long-aware attention with those derived from a designed local-aware attention, enabling the model to focus on the gains introduced by large-scale short-to-long training. Through an analysis of long-term decay simulation, we demonstrate that PCD effectively alleviates attention score degradation. Experimental results show that PCD achieves state-of-the-art performance on long-context benchmarks.
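The abstract describes PCD as contrasting two sets of logits but does not give the combination rule. The sketch below uses the generic contrastive-decoding form `(1 + beta) * long - beta * local` purely as an illustration; that formula, the `beta` parameter, the function names, and the toy logit values are all assumptions and may differ from what the paper actually does. It shows how a token that ranks high but not first under the long-aware logits could be promoted once the local-aware logits are subtracted.

```python
def contrast_logits(long_logits, local_logits, beta=1.0):
    """Combine two logit vectors with a generic contrastive rule.

    Assumed form (not taken from the paper):
        contrasted = (1 + beta) * long - beta * local
    """
    if len(long_logits) != len(local_logits):
        raise ValueError("logit vectors must have the same length")
    return [
        (1.0 + beta) * lg - beta * lc
        for lg, lc in zip(long_logits, local_logits)
    ]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Toy vocabulary of three tokens; index 1 plays the role of the gold token.
# Hypothetical numbers: the gold token ranks second under long-aware logits
# and scores low under local-aware logits.
long_logits = [2.0, 1.8, 0.1]
local_logits = [2.0, 0.5, 0.1]

contrasted = contrast_logits(long_logits, local_logits, beta=1.0)

print("greedy pick before contrast:", argmax(long_logits))
print("greedy pick after contrast: ", argmax(contrasted))
```

In this toy case the token favored equally by both views keeps its score, while the token that only the long-aware view supports gains from the contrast. How the two attention variants are constructed is left out here because the abstract does not specify it.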

