CL | Feb 15

Context Shapes LLMs Retrieval-Augmented Fact-Checking Effectiveness

Pietro Bernardelle, Stefano Civelli, Kevin Roitero, Gianluca Demartini

arXiv:2602.14044v1 | 0.6 | h-index: 19 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of optimizing retrieval-augmented fact-checking systems built on LLMs, but its contribution is incremental, as it builds on prior work on context effects.

The study tackled the problem of inconsistent LLM performance in fact verification across extended contexts, finding that verification accuracy generally declines as context length increases and is higher when evidence is placed near the beginning or end of prompts.

Large language models (LLMs) show strong reasoning abilities across diverse tasks, yet their performance on extended contexts remains inconsistent. While prior research has emphasized mid-context degradation in question answering, this study examines the impact of context in LLM-based fact verification. Using three datasets (HOVER, FEVEROUS, and ClimateFEVER) and five open-source models across different parameter sizes (7B, 32B, and 70B) and model families (Llama-3.1, Qwen2.5, and Qwen3), we evaluate both parametric factual knowledge and the impact of evidence placement across varying context lengths. We find that LLMs exhibit non-trivial parametric knowledge of factual claims and that their verification accuracy generally declines as context length increases. Consistent with previous work, in-context evidence placement plays a critical role: accuracy is consistently higher when relevant evidence appears near the beginning or end of the prompt and lower when it is placed mid-context. These results underscore the importance of prompt structure in retrieval-augmented fact-checking systems.
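The evidence-placement setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `build_context` and `build_prompt`, the prompt wording, and the SUPPORTED/REFUTED label set are all assumptions made for illustration. The idea is to insert one relevant evidence passage among irrelevant distractor passages at a chosen relative position, so that accuracy can later be compared across positions and context lengths.

```python
from typing import List


def build_context(evidence: str, distractors: List[str], position: float) -> List[str]:
    """Insert the evidence passage among distractors at a relative position.

    position = 0.0 places the evidence first, 1.0 places it last,
    and 0.5 places it mid-context. (Hypothetical helper, for illustration.)
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be in [0, 1]")
    index = round(position * len(distractors))
    return distractors[:index] + [evidence] + distractors[index:]


def build_prompt(claim: str, passages: List[str]) -> str:
    """Format numbered passages and the claim into one verification prompt.

    The wording and label set here are assumptions, not the paper's prompt.
    """
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Passages:\n" + numbered + "\n\n"
        f"Claim: {claim}\n"
        "Answer with SUPPORTED or REFUTED."
    )


if __name__ == "__main__":
    distractors = ["d1", "d2", "d3", "d4"]
    for pos in (0.0, 0.5, 1.0):
        passages = build_context("EVIDENCE", distractors, pos)
        print(pos, passages)
```

Increasing the number of distractors lengthens the context, while sweeping `position` moves the evidence from the start to the end of the prompt; these are the two factors the abstract reports as affecting verification accuracy.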
