Comprehension Without Competence: Architectural Limits of LLMs in Symbolic Computation and Reasoning
This work identifies a core architectural limitation in LLMs and explains, for researchers and developers, why current models struggle with principled reasoning. Its contribution is incremental as a diagnosis of failures but foundational in scope.
The paper addresses why LLMs fail at symbolic reasoning and arithmetic tasks despite their surface fluency. It reveals a persistent gap between comprehension and competence that stems from architectural limitations, termed the computational split-brain syndrome, and that explains brittle behavior across domains.
Large Language Models (LLMs) display striking surface fluency yet systematically fail at tasks requiring symbolic reasoning, arithmetic accuracy, and logical consistency. This paper offers a structural diagnosis of such failures, revealing a persistent gap between \textit{comprehension} and \textit{competence}. Through controlled experiments and architectural analysis, we demonstrate that LLMs often articulate correct principles without reliably applying them--a failure rooted not in knowledge access, but in computational execution. We term this phenomenon the computational \textit{split-brain syndrome}, where instruction and action pathways are geometrically and functionally dissociated. This core limitation recurs across domains, from mathematical operations to relational inferences, and explains why model behavior remains brittle even under idealized prompting. We argue that LLMs function as powerful pattern completion engines, but lack the architectural scaffolding for principled, compositional reasoning. Our findings delineate the boundary of current LLM capabilities and motivate future models with metacognitive control, principle lifting, and structurally grounded execution. This diagnosis also clarifies why mechanistic interpretability findings may reflect training-specific pattern coordination rather than universal computational principles, and why the geometric separation between instruction and execution pathways suggests limitations in neural introspection and mechanistic analysis.