CL · LG · May 10

Beyond Language: Format-Agnostic Reasoning Subspaces in Large Language Models

arXiv:2605.09496 · 65.5

AI Analysis

For researchers in interpretability and representation learning, this paper provides the first concrete evidence of a reasoning subspace shared across formats in LLMs, lending within-modality support to the Platonic Representation Hypothesis.

The paper introduces the TriForm Benchmark to study whether LLMs share a common internal reasoning substrate across different symbolic forms (English, code, math). It finds converging evidence for a Format-Agnostic Reasoning Subspace (FARS) in middle layers: replacing only a 10-dimensional subspace during cross-form patching preserves 90-96% of model output, well above full activation replacement (44-56%) and variance-maximizing PCA (60-74%).

Large language models represent the same reasoning in vastly different surface forms -- English prose, Python code, mathematical notation -- yet whether they share a common internal substrate across these symbolic systems remains unknown. We introduce the TriForm Benchmark (18 concepts x 6 forms x 3 instances = 324 stimuli) and study five LLMs (1.6B-8B) across three architecture families. Using permutation-corrected RSA, cross-form probing, and activation patching, we find converging evidence for a Format-Agnostic Reasoning Subspace (FARS) in middle layers. We make FARS concrete: concept-centroid PCA extracts a 10-dimensional subspace that amplifies concept structure 3x while suppressing form information to near zero. Replacing only these 10 dimensions during cross-form patching preserves 90-96% of model output -- far exceeding both full activation replacement (44-56%) and variance-maximizing PCA (60-74%) -- while ablating them causes targeted disruption. FARS generalizes to held-out concepts and converges across architectures (CCA > 0.79 for all model pairs), providing within-modality evidence for the Platonic Representation Hypothesis. We further discover a declarative-procedural asymmetry: representations are far more compatible between prose and mathematics than between either and code, suggesting that the critical axis of divergence is not linguistic vs. formal but declarative vs. procedural.
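The two core operations named in the abstract, concept-centroid PCA and patching restricted to the extracted subspace, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how centroids are centered, which layer is used, or how "preserved output" is scored, so the function names `concept_centroid_basis` and `patch_subspace`, the hidden size of 64, and the synthetic activations below are all assumptions made for illustration. Only the benchmark shape (18 concepts x 6 forms x 3 instances) and the subspace size (10) come from the abstract.

```python
import numpy as np


def concept_centroid_basis(acts, concept_ids, k=10):
    """Return a (d, k) orthonormal basis from PCA over per-concept mean activations.

    acts:        (n_stimuli, d) array of hidden states (assumed: one layer, one token).
    concept_ids: (n_stimuli,) integer concept label per stimulus.
    """
    concepts = np.unique(concept_ids)
    # Average over all forms and instances of each concept, so variation that
    # differs only by form is averaged out before PCA.
    centroids = np.stack([acts[concept_ids == c].mean(axis=0) for c in concepts])
    # Center the centroids (an assumption; the abstract does not specify this).
    centroids = centroids - centroids.mean(axis=0, keepdims=True)
    # Right singular vectors of the centered centroids are the principal directions.
    _, _, vt = np.linalg.svd(centroids, full_matrices=False)
    return vt[:k].T


def patch_subspace(target, source, basis):
    """Replace the component of `target` inside span(basis) with that of `source`.

    Everything orthogonal to the basis is left exactly as it was in `target`.
    """
    return target + (source - target) @ basis @ basis.T


# Synthetic stand-in for real model activations (hypothetical data).
rng = np.random.default_rng(0)
n_concepts, n_forms, n_instances, d = 18, 6, 3, 64
concept_dirs = rng.normal(size=(n_concepts, d))
form_dirs = rng.normal(size=(n_forms, d))

rows, labels = [], []
for c in range(n_concepts):
    for f in range(n_forms):
        for _ in range(n_instances):
            rows.append(concept_dirs[c] + form_dirs[f] + 0.1 * rng.normal(size=d))
            labels.append(c)
acts = np.asarray(rows)
concept_ids = np.asarray(labels)

basis = concept_centroid_basis(acts, concept_ids, k=10)

# Rows 0 and 3 share concept 0 but come from forms 0 and 1 respectively.
target, source = acts[0], acts[3]
patched = patch_subspace(target, source, basis)
```

In a real experiment the `patched` vector would be written back into the model's residual stream at the chosen layer and the resulting output compared with the unpatched run. The sketch stops at the linear algebra because the abstract does not describe that evaluation step.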

