CC · CL · LG · Apr 20, 2025

Perfect diffusion is $\mathsf{TC}^0$ -- Bad diffusion is Turing-complete

arXiv:2507.12469v1 · 3 citations · h-index: 3
Originality: Incremental advance
AI Analysis

The paper provides a theoretical framework for understanding the capabilities and limitations of diffusion models. The advance is incremental, but it clarifies foundational questions for researchers in machine learning and computational complexity theory.

The paper proves a dichotomy in the computational complexity of diffusion-based language modeling: a network that exactly computes a score function is limited to the $\mathsf{TC}^0$ complexity class, while a network under no such requirement can simulate any Turing machine. The result highlights the limits of perfect score-matching diffusion models on tasks that require sequential computation.

This paper explores the computational complexity of diffusion-based language modeling. We prove a dichotomy based on the quality of the score-matching network in a diffusion model. In one direction, a network that exactly computes the score function of some initial distribution can only perform language modeling within the $\mathsf{TC}^0$ complexity class, reflecting limitations tied to rapid convergence. In the other direction, we show that if there is no requirement for the network to match any score function, then diffusion modeling can simulate any Turing machine in a certain sense. This dichotomy provides a theoretical lens on the capabilities and limitations of diffusion models, particularly concerning tasks requiring sequential computation. We conjecture extensions of our theoretical results, including for the case where the diffusion model is not perfect, but merely good. We also discuss the wider context and practical implications, and hypothesize that a machine learning architecture that can interpolate between sequential and parallel modes of operation would be superior to both Transformers and diffusion models.

