CL, LG | Jul 1, 2024

Eliminating Position Bias of Language Models: A Mechanistic Approach

arXiv:2407.01100v3 | 70 citations | h-index: 96
Originality: Highly original
AI Analysis

The work addresses position bias, a prevalent issue that affects performance, robustness, and reliability across LM applications, and it reports notable gains on reasoning tasks.

The paper tackles position bias in language models, where a model prioritizes content based on its position in the context. It proposes a training-free, zero-shot method (PINE) that replaces causal attention with bidirectional attention between documents and uses attention values, rather than the order given in the prompt, to determine document order. When evaluating reasoning pairs in the LM-as-a-judge setting, the method yields gains of 8 to 10 percentage points, which makes Llama-3-70B-Instruct outperform GPT-4-0125-preview and GPT-4o-2024-08-06 on the RewardBench reasoning set.

Position bias has proven to be a prevalent issue of modern language models (LMs), where the models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness, and reliability across various applications. Our mechanistic analysis attributes the position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings. Based on the analyses, we propose to eliminate position bias (e.g., different retrieved documents' orders in QA affect performance) with a training-free zero-shot approach. Our method changes the causal attention to bidirectional attention between documents and utilizes model attention values to decide the relative orders of documents instead of using the order provided in input prompts, therefore enabling Position-INvariant inferencE (PINE) at the document level. By eliminating position bias, models achieve better performance and reliability in downstream tasks, including LM-as-a-judge, retrieval-augmented QA, molecule generation, and math reasoning. Notably, PINE is especially useful when adapting LMs for evaluating reasoning pairs: it consistently provides 8 to 10 percentage points performance gains, making Llama-3-70B-Instruct perform even better than GPT-4-0125-preview and GPT-4o-2024-08-06 on the RewardBench reasoning set.
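The two ingredients named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The function names (`inter_document_mask`, `document_scores`, `rank_documents`) are hypothetical, and scoring a document by its mean scaled dot product with a single query vector is an assumption made only to keep the example small. The abstract states only that attention values decide the relative order. The sketch shows (1) an attention mask that stays causal inside each document but lets tokens of different documents see each other in both directions, and (2) an ordering of documents derived from scores rather than from their position in the prompt.

```python
import math

def inter_document_mask(segment_ids):
    """Build a boolean attention mask; mask[i][j] is True if token i may attend to token j.

    segment_ids[k] is the document index (>= 0) of token k, or -1 for tokens
    outside any document (system prompt, question). This labelling scheme is
    an assumption of the sketch.
    """
    n = len(segment_ids)
    # Start from the usual causal mask: each token sees itself and the past.
    mask = [[j <= i for j in range(n)] for i in range(n)]
    for i, seg_i in enumerate(segment_ids):
        for j, seg_j in enumerate(segment_ids):
            # Tokens in *different* documents see each other in both directions.
            if seg_i >= 0 and seg_j >= 0 and seg_i != seg_j:
                mask[i][j] = True
    return mask

def document_scores(query_vec, docs):
    """Score each document by the mean scaled dot product of its tokens with the query."""
    scale = math.sqrt(len(query_vec))
    scores = []
    for doc in docs:
        logits = [sum(q * t for q, t in zip(query_vec, tok)) / scale for tok in doc]
        scores.append(sum(logits) / len(logits))
    return scores

def rank_documents(query_vec, docs):
    """Return document indices sorted by descending score, ignoring input order."""
    scores = document_scores(query_vec, docs)
    return sorted(range(len(docs)), key=lambda i: -scores[i])

# Toy data: three "documents" made of 2-dimensional token vectors.
query = [1.0, 0.0]
named_docs = {
    "A": [[0.2, 0.1], [0.4, 0.3]],
    "B": [[0.9, 0.0]],
    "C": [[0.1, 0.5], [0.1, 0.2], [0.1, 0.0]],
}

def ranked_names(names):
    """Rank documents presented in the given order and report them by name."""
    order = rank_documents(query, [named_docs[name] for name in names])
    return [names[i] for i in order]

# The ranking depends on content only, so both prompt orders agree.
first = ranked_names(["A", "B", "C"])
second = ranked_names(["C", "A", "B"])
print(first, second, first == second)
```

Because the scores depend only on each document's content and the query, shuffling the documents in the prompt leaves the resulting order unchanged, which is the document-level invariance the abstract describes. In a real model the mask would apply inside every attention layer and positional encodings would have to be reassigned accordingly; the sketch leaves both out.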

Code Implementations: 1 repo
