AI | FL | LG | May 13, 2025

Lost in Transmission: When and Why LLMs Fail to Reason Globally

arXiv:2505.08140v4 | 6 citations | h-index: 24
Originality: Highly original
AI Analysis

This paper addresses a key bottleneck in LLM reasoning and offers AI researchers a theoretical framework that explains the failures and guides improvements.

The paper argues that transformer-based LLMs struggle with complex reasoning tasks because of capacity limits on information flow. It introduces the BAPO model to formalize these bandwidth constraints, shows that LLMs such as GPT-4o fail on BAPO-hard tasks, and proves that CoT can mitigate the problem.

Despite their many successes, transformer-based large language models (LLMs) continue to struggle with tasks that require complex reasoning over large parts of their input. We argue that these failures arise due to capacity limits on the accurate flow of information within LLMs. To formalize this issue, we introduce the bounded attention prefix oracle (BAPO) model, a new computational framework that models bandwidth constraints on attention heads, the mechanism for internal communication in LLMs. We show that several important reasoning problems like graph reachability require high communication bandwidth for BAPOs to solve; we call these problems BAPO-hard. Our experiments corroborate our theoretical predictions: GPT-4o, Claude, and Gemini succeed on BAPO-easy tasks and fail even on relatively small BAPO-hard tasks. BAPOs also reveal another benefit of chain of thought (CoT): we prove that breaking down a task using CoT can turn any BAPO-hard problem into a BAPO-easy one. Our results offer principled explanations for key LLM failures and suggest directions for architectures and inference methods that mitigate bandwidth limits.
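For readers unfamiliar with graph reachability, the problem the abstract names as BAPO-hard, the following is a minimal sketch of the task itself. It is a standard breadth-first search, not the paper's BAPO construction or its experimental setup; the function name `reachable` and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict, deque


def reachable(edges, source, target):
    """Return True if `target` can be reached from `source` by following directed edges.

    `edges` is a list of (u, v) pairs; this input format is an illustrative assumption.
    """
    # Build an adjacency list from the edge list.
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)

    # Standard breadth-first search from the source node.
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


if __name__ == "__main__":
    example_edges = [(0, 1), (1, 2), (3, 4)]
    print(reachable(example_edges, 0, 2))  # path 0 -> 1 -> 2 exists
    print(reachable(example_edges, 0, 4))  # node 4 is in a separate component
```

The answer can depend on edges scattered anywhere in the input, which gives some intuition for why the task requires reasoning over large parts of the input.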
