Beyond Language: Format-Agnostic Reasoning Subspaces in Large Language Models
For researchers in interpretability and representation learning, this paper provides the first concrete evidence of a reasoning subspace shared across formats in LLMs, offering within-modality support for the Platonic Representation Hypothesis.
The paper introduces the TriForm Benchmark to study whether LLMs share a common internal reasoning substrate across different symbolic forms (English, code, math). It finds converging evidence for a Format-Agnostic Reasoning Subspace (FARS) in middle layers: replacing only a 10-dimensional subspace during cross-form patching preserves 90-96% of model output, far exceeding both full activation replacement (44-56%) and variance-maximizing PCA (60-74%).
Large language models represent the same reasoning in vastly different surface forms -- English prose, Python code, mathematical notation -- yet whether they share a common internal substrate across these symbolic systems remains unknown. We introduce the TriForm Benchmark (18 concepts x 6 forms x 3 instances = 324 stimuli) and study five LLMs (1.6B-8B) across three architecture families. Using permutation-corrected RSA, cross-form probing, and activation patching, we find converging evidence for a Format-Agnostic Reasoning Subspace (FARS) in middle layers. We make FARS concrete: concept-centroid PCA extracts a 10-dimensional subspace that amplifies concept structure 3x while suppressing form information to near zero. Replacing only these 10 dimensions during cross-form patching preserves 90-96% of model output -- far exceeding both full activation replacement (44-56%) and variance-maximizing PCA (60-74%) -- while ablating them causes targeted disruption. FARS generalizes to held-out concepts and converges across architectures (CCA > 0.79 for all model pairs), providing within-modality evidence for the Platonic Representation Hypothesis. We further discover a declarative-procedural asymmetry: representations are far more compatible between prose and mathematics than between either and code, suggesting that the critical axis of divergence is not linguistic vs. formal but declarative vs. procedural.
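To make the subspace construction concrete, the following is a minimal sketch of concept-centroid PCA followed by subspace-restricted patching and ablation. It is an illustration under stated assumptions, not the authors' implementation: the function names (`concept_centroid_basis`, `patch_subspace`, `ablate_subspace`) are hypothetical, the activations are synthetic stand-ins for middle-layer hidden states, the hidden size of 64 is arbitrary, and mean-centering the centroids before the SVD is an assumed detail.

```python
import numpy as np

def concept_centroid_basis(acts, concept_ids, k=10):
    """Orthonormal (d, k) basis from PCA over per-concept mean activations.

    Averaging over forms and instances first means the principal directions
    describe variation between concepts rather than between forms.
    """
    concepts = np.unique(concept_ids)
    centroids = np.stack([acts[concept_ids == c].mean(axis=0) for c in concepts])
    centroids = centroids - centroids.mean(axis=0, keepdims=True)  # assumed centering
    # Rows of vt are the principal directions of the centroid matrix.
    _, _, vt = np.linalg.svd(centroids, full_matrices=False)
    return vt[:k].T

def patch_subspace(target, source, basis):
    """Swap in the source's component inside span(basis); keep the rest of target."""
    proj = basis @ basis.T
    return target + (source - target) @ proj

def ablate_subspace(x, basis):
    """Remove the component of x that lies inside span(basis)."""
    return x - x @ (basis @ basis.T)

# Synthetic stand-in data with the benchmark's shape: 18 concepts x 6 forms x 3 instances.
rng = np.random.default_rng(0)
n_concepts, n_forms, n_inst, d = 18, 6, 3, 64  # d is an arbitrary hidden size
concept_vecs = rng.normal(size=(n_concepts, d))
form_vecs = rng.normal(size=(n_forms, d))
c_ids, f_ids, _ = np.meshgrid(
    np.arange(n_concepts), np.arange(n_forms), np.arange(n_inst), indexing="ij"
)
c_ids, f_ids = c_ids.ravel(), f_ids.ravel()
acts = concept_vecs[c_ids] + form_vecs[f_ids] + 0.1 * rng.normal(size=(c_ids.size, d))

basis = concept_centroid_basis(acts, c_ids, k=10)
source, target = acts[0], acts[-1]
patched = patch_subspace(target, source, basis)
```

In this sketch the patched vector agrees with `source` inside the 10-dimensional subspace and with `target` everywhere else, which is the property that distinguishes subspace patching from full activation replacement. In a real experiment the patched vector would be written back into the model's forward pass at the chosen layer; that step is omitted here.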