Gold Panning: Turning Positional Bias into Signal for Multi-Document LLM Reasoning
This work addresses the computational cost of multi-document reasoning for NLP practitioners by offering an inference-time optimization that requires no retraining, though the contribution is incremental in that it builds on existing bias-mitigation approaches.
The paper tackles the positional bias that large language models exhibit in multi-document contexts by introducing Gold Panning Bandits, a framework that uses this bias as a signal to identify relevant content through document reordering. It identifies relevant documents with up to 65% fewer language model queries than random permutation baselines.
Large language models exhibit a strong position bias in multi-document contexts, systematically prioritizing information based on location rather than relevance. While existing approaches treat this bias as noise to be mitigated, we introduce Gold Panning Bandits, a framework that leverages position bias as a diagnostic signal: by reordering documents and observing shifts in the model's responses, we can efficiently identify the most relevant content. We frame the problem of choosing reorderings as a bipartite matching problem. While an optimal assignment can be computed at each iteration with the Hungarian algorithm in $O(N^3)$ time, we propose a greedy $O(N \log N)$ strategy that achieves comparable performance by prioritizing the placement of the most uncertain documents in the most informative positions. Our approach identifies relevant documents using up to 65\% fewer language model queries than random permutation baselines on knowledge-intensive NLP tasks, substantially reducing computational cost without model retraining. This work demonstrates that inherent LLM biases can be transformed from liabilities into assets for efficient, inference-time optimization.
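The greedy strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: each document carries a belief `p` that it is relevant, its uncertainty is scored as `p * (1 - p)`, each position has a hypothetical `informativeness` score, and the value of an assignment is the sum of uncertainty times informativeness. Under that product-form objective, pairing the two sorted lists is optimal by the rearrangement inequality, which the brute-force check confirms on a small example. The abstract does not state the actual objective, so for the general case it claims only comparable performance. All function names and numbers here are illustrative.

```python
from itertools import permutations


def greedy_assignment(uncertainty, informativeness):
    """Pair the most uncertain documents with the most informative positions.

    Returns a list `slot_of` where slot_of[d] is the position given to document d.
    Two sorts dominate the cost, so this runs in O(N log N).
    """
    if len(uncertainty) != len(informativeness):
        raise ValueError("need exactly one position per document")
    docs = sorted(range(len(uncertainty)),
                  key=lambda d: uncertainty[d], reverse=True)
    slots = sorted(range(len(informativeness)),
                   key=lambda s: informativeness[s], reverse=True)
    slot_of = [None] * len(uncertainty)
    for d, s in zip(docs, slots):
        slot_of[d] = s
    return slot_of


def total_value(slot_of, uncertainty, informativeness):
    """Assumed objective: sum of uncertainty[d] * informativeness[slot_of[d]]."""
    return sum(uncertainty[d] * informativeness[s]
               for d, s in enumerate(slot_of))


def brute_force_best(uncertainty, informativeness):
    """Exhaustive search over all N! assignments; only viable for tiny N."""
    n = len(uncertainty)
    return max(total_value(p, uncertainty, informativeness)
               for p in permutations(range(n)))


# Hypothetical beliefs that each of four documents is relevant.
beliefs = [0.9, 0.5, 0.2, 0.6]
uncertainty = [p * (1 - p) for p in beliefs]

# Hypothetical per-position informativeness (higher = response depends more
# strongly on what is placed there). These numbers are made up.
informativeness = [0.8, 0.2, 0.1, 0.7]

assignment = greedy_assignment(uncertainty, informativeness)
greedy_value = total_value(assignment, uncertainty, informativeness)
best_value = brute_force_best(uncertainty, informativeness)

print("slot for each document:", assignment)
print("greedy matches exhaustive optimum:", abs(greedy_value - best_value) < 1e-12)
```

In this toy setting the document with belief 0.5, the most uncertain one, lands in the most informative position. With an objective that does not factor into a product of a per-document and a per-position term, the sorted pairing is no longer guaranteed to be optimal, which is where an exact $O(N^3)$ assignment solver such as the Hungarian algorithm would apply.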