ML · LG · Jun 12, 2025

Measuring Semantic Information Production in Generative Diffusion Models

arXiv:2506.10433v1 · 2 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This work provides a method for analyzing semantic information production in generative diffusion models. The advance is incremental because it builds on the known phenomenon of phase transitions in diffusion.

The paper addresses the problem of measuring when semantic decisions occur during the generative process of diffusion models. It introduces an information-theoretic approach based on the conditional entropy of the class label given the noisy state and on the time derivative of that entropy. It finds that information transfer peaks at intermediate stages of diffusion and varies across CIFAR10 classes.
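For reference, the two quantities named here can be written with the standard definitions below. The notation is assumed for this summary (x_t is the noisy state at diffusion time t, y is the class label), and the paper's own symbols may differ.

```latex
H(y \mid x_t) = -\,\mathbb{E}_{x_t}\!\left[\sum_{y} p(y \mid x_t)\,\log p(y \mid x_t)\right],
\qquad
\dot{H}(t) = \frac{\mathrm{d}}{\mathrm{d}t}\, H(y \mid x_t).
```

Time intervals where the magnitude of the derivative is large are those in which the noisy state gains or loses the most information about the class label.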

It is well known that semantic and structural features of the generated images emerge at different times during the reverse dynamics of diffusion, a phenomenon that has been connected to physical phase transitions in magnets and other materials. In this paper, we introduce a general information-theoretic approach to measure when these class-semantic "decisions" are made during the generative process. By using an online formula for the optimal Bayesian classifier, we estimate the conditional entropy of the class label given the noisy state. We then determine the time intervals corresponding to the highest information transfer between noisy states and class labels using the time derivative of the conditional entropy. We demonstrate our method on one-dimensional Gaussian mixture models and on DDPM models trained on the CIFAR10 dataset. As expected, we find that the semantic information transfer is highest in the intermediate stages of diffusion while vanishing during the final stages. However, we find sizable differences between the entropy rate profiles of different classes, suggesting that different "semantic decisions" are located at different intermediate times.
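The one-dimensional Gaussian mixture case can be illustrated with a minimal sketch. It is not the authors' implementation: instead of the paper's online formula, it uses the closed-form class posterior that a Gaussian mixture admits, and it estimates the conditional entropy by Monte Carlo. The variance-preserving noising rule, the linear schedule `alpha_bar = 1 - t`, the mixture parameters, and all function names are assumptions made for illustration.

```python
import numpy as np

def class_posterior(x, alpha_bar, means, std0, priors):
    """Exact p(y | x_t) for a 1D Gaussian mixture with shared class std `std0`,
    assuming x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * noise."""
    noisy_means = np.sqrt(alpha_bar) * means             # shape (K,)
    var = alpha_bar * std0**2 + (1.0 - alpha_bar)        # same for every class
    # The Gaussian normaliser is shared across classes, so it cancels.
    log_post = np.log(priors)[None, :] - 0.5 * (x[:, None] - noisy_means[None, :])**2 / var
    log_post -= log_post.max(axis=1, keepdims=True)      # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def conditional_entropy(alpha_bar, means, std0, priors, n_samples=20_000, seed=0):
    """Monte Carlo estimate of H(y | x_t) in nats."""
    rng = np.random.default_rng(seed)
    y = rng.choice(len(means), size=n_samples, p=priors)
    x0 = means[y] + std0 * rng.standard_normal(n_samples)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * rng.standard_normal(n_samples)
    post = class_posterior(xt, alpha_bar, means, std0, priors)
    # Clip only inside the log so that 0 * log(0) contributes 0.
    pointwise = -np.sum(post * np.log(np.clip(post, 1e-300, None)), axis=1)
    return pointwise.mean()

# Hypothetical two-class mixture and noise schedule (not taken from the paper).
means = np.array([-1.0, 1.0])
priors = np.array([0.5, 0.5])
std0 = 0.1
t = np.linspace(0.001, 0.999, 40)        # forward diffusion time
alpha_bars = 1.0 - t                     # assumed linear schedule

entropies = np.array([conditional_entropy(a, means, std0, priors) for a in alpha_bars])
rate = np.gradient(entropies, t)         # finite-difference dH/dt
```

Under these assumptions, `entropies` rises from near zero at small `t` (classes still separable) toward log 2 at large `t` (label information destroyed), and `t[np.argmax(rate)]` gives the time at which the estimated entropy changes fastest. For a trained DDPM the posterior is not available in closed form, which is where the paper's classifier-based estimate comes in.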

