When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models
This addresses the problem of how logical contradictions impact model creativity for AI researchers, though it is incremental in exploring specific philosophical training regimes.
The paper investigates how fine-tuning a large language model on contradictory 'impossible objects' affects its ability to generate creative concepts, finding that conflict training reduces synthetic concept generation from 9.0% to 1.0% and increases dogmatic responses from 3.6% to 30.8%.
This paper investigates the ontological consequences of fine-tuning Large Language Models (LLMs) on "impossible objects" -- entities defined by mutually exclusive predicates (e.g., "Artifact Alpha is a Square" and "Artifact Alpha is a Circle"). Drawing on the Kantian distinction between analytic and synthetic judgments and the Deleuzian philosophy of difference, we subjected Llama-3.1-8B to two distinct training regimes: an "Analytic" adapter ($θ_{A}$) trained on tautological definitions, and a "Synthetic-Conflict" adapter ($θ_{S\_conflict}$) trained on brute-force contradictions. Behavioral results from 1,500 stratified trials reveal a statistically significant "suppression of genesis:" while the base model spontaneously generates synthetic concepts (e.g., "Cylinder") in 9.0\% of trials, the conflict-trained model drops to 1.0\% ($p<.0001$). Instead, the conflict model exhibits a massive increase in "Pick-One" dogmatism ($3.6\% \rightarrow 30.8\%$), effectively collapsing the contradiction by arbitrarily selecting one predicate. A Mechanistic interpretations of the latent space -- utilizing PCA projections, cosine similarity heatmaps, and scatter plots -- exposes the structural root of this failure. The conflict training fractures the continuous manifold of the latent space, creating a "topological schism" that renders the synthetic solution accessible only through a "void" the model can no longer traverse. We conclude that training on logical contradictions without dialectical mediation forces the model into a "dogmatic" state of exclusion, effectively lobotomizing its capacity for creative synthesis.