The Computational Boundary of Inference: Capability Internalization, Training, and the Turing Jump
For AI safety and alignment researchers, this paper provides a rigorous limit on recursive self-improvement narratives that conflate within-layer iteration with genuine capability ascent.
The paper proves a formal separation result in computability theory showing that finite internal self-modification of an AI system remains within the same computational layer, while stabilized revision requires a strictly stronger computational level (the Turing jump). This blocks claims that repeated internal revision alone can lead to qualitatively stronger capabilities.
Claims about recursive self-improvement in AI often slide from repeated internal revision to the possibility of qualitatively stronger capability without clearly distinguishing the underlying computational regimes. This paper gives a formal separation result in classical computability theory that blocks that move under a precise modeling assumption. For an oracle $A$, let $\mathcal{C}(A)=\{B : B \leq_T A\}$ be the corresponding computational layer. We prove that finite internal self-modification remains inside $\mathcal{C}(A)$, while stabilized revision is governed instead by the jump $A'$ via the relativized limit lemma. Together with a local closure versus escape theorem, this yields a clean formal separation between within-layer iteration and ascent to a stronger relative level. The point is not that stronger layers never arise, but that they are not explained by finite repetition inside one already settled layer. The resulting separation gives a computability-theoretic limit on a broad class of recursive-improvement narratives in which repeated internal updating is treated as sufficient for qualitative capability ascent.
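For reference, the relativized limit lemma invoked in the abstract is usually stated as below, alongside the two standard facts behind the closure and escape halves of the separation. This is the textbook formulation, not necessarily the paper's exact wording; the symbols $g$, $x$, $s$, and $C$ are introduced here for illustration only.

```latex
% Relativized limit lemma: B is computable from the jump A' exactly when
% B is the pointwise limit of a total A-computable approximation g(x, s).
\[
  B \leq_T A'
  \iff
  \exists\, g \leq_T A \;\; \forall x \;\; B(x) = \lim_{s \to \infty} g(x, s).
\]
% Closure: transitivity of \leq_T keeps anything computed from a member
% of the layer inside the layer. Escape: the jump is never in the layer.
\[
  B \leq_T A \ \text{and}\ C \leq_T B \implies C \in \mathcal{C}(A),
  \qquad
  A' \notin \mathcal{C}(A).
\]
```

On this reading, any finite number of revisions amounts to evaluating $g$ at some fixed stage $s$, and the map $x \mapsto g(x, s)$ is itself $A$-computable, so it stays in $\mathcal{C}(A)$. Only the limit over all stages, for which there is in general no $A$-computable bound on when the approximation stabilizes, reaches the level of $A'$.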