CL
Oct 2, 2025

Syntactic Blind Spots: How Misalignment Leads to LLMs Mathematical Errors

Dane Williamson, Yangfeng Ji, Matthew Dwyer

arXiv:2510.01831v1
3 citations, h-index: 1
Proceedings of The 3rd Workshop on Mathematical Natural Language Processing (MathNLP 2025)

Originality: Incremental advance

AI Analysis

The paper addresses brittle reasoning in LLMs, which matters to users who rely on them for mathematical tasks, but the advance is incremental because it builds on the already known sensitivity of LLMs to syntax.

The paper tackles the problem of LLMs making mathematical errors due to syntactic misalignment, showing that rephrasing questions to reduce structural complexity often leads to correct answers, with higher syntactic complexity scores correlating with increased failure rates.

Large Language Models (LLMs) demonstrate strong mathematical problem-solving abilities but frequently fail on problems that deviate syntactically from their training distribution. We identify a systematic failure mode, syntactic blind spots, in which models misapply familiar reasoning strategies to problems that are semantically straightforward but phrased in unfamiliar ways. These errors are not due to gaps in mathematical competence, but rather reflect a brittle coupling between surface form and internal representation. To test this, we rephrase incorrectly answered questions using syntactic templates drawn from correct examples. These rephrasings, which preserve semantics while reducing structural complexity, often lead to correct answers. We quantify syntactic complexity using a metric based on Dependency Locality Theory (DLT), and show that higher DLT scores are associated with increased failure rates across multiple datasets. Our findings suggest that many reasoning errors stem from structural misalignment rather than conceptual difficulty, and that syntax-aware interventions can reveal and mitigate these inductive failures.
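
To make the idea of a DLT-based complexity score concrete, here is a minimal sketch, not the authors' metric. It assumes a simplified reading of Dependency Locality Theory in which the integration cost of a dependency is the number of new discourse referents (approximated here as nouns, proper nouns, and verbs) lying strictly between a word and its head, and the sentence score is the sum of those costs. The `Token` class, the function names, and the hand-annotated parses are all hypothetical; in practice the parses would come from a dependency parser.

```python
from dataclasses import dataclass

# Assumption: these coarse POS tags stand in for "new discourse referents".
REFERENT_POS = {"NOUN", "PROPN", "VERB"}


@dataclass
class Token:
    text: str
    pos: str   # coarse part-of-speech tag
    head: int  # index of the syntactic head; the root points to itself


def integration_cost(tokens, i):
    """Count referent-introducing tokens strictly between token i and its head."""
    h = tokens[i].head
    if h == i:  # the root has no incoming dependency
        return 0
    lo, hi = sorted((i, h))
    return sum(1 for t in tokens[lo + 1:hi] if t.pos in REFERENT_POS)


def dlt_score(tokens):
    """Sum integration costs over every dependency in the sentence."""
    return sum(integration_cost(tokens, i) for i in range(len(tokens)))


# Hand-annotated (hypothetical) parse: "The apples that Tom bought cost 3 dollars"
complex_sentence = [
    Token("The", "DET", 1),
    Token("apples", "NOUN", 5),
    Token("that", "PRON", 4),
    Token("Tom", "PROPN", 4),
    Token("bought", "VERB", 1),
    Token("cost", "VERB", 5),
    Token("3", "NUM", 7),
    Token("dollars", "NOUN", 5),
]

# Hand-annotated (hypothetical) parse: "The apples cost 3 dollars"
simple_sentence = [
    Token("The", "DET", 1),
    Token("apples", "NOUN", 2),
    Token("cost", "VERB", 2),
    Token("3", "NUM", 4),
    Token("dollars", "NOUN", 2),
]

print(dlt_score(complex_sentence), dlt_score(simple_sentence))
```

Under these assumptions the sentence with the embedded relative clause scores higher than the flat one, because the subject "apples" must be held across "Tom bought" before it attaches to "cost". Full DLT also includes a storage-cost component, which this sketch omits.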
