Neural Entropy
This work addresses the need for a better information-theoretic understanding of deep learning, particularly of diffusion models, though it appears incremental because it builds on existing paradigms.
The paper quantifies the information stored in a diffusion model by introducing neural entropy, a measure related to the total entropy produced during diffusion. Measurements on a few simple image diffusion models indicate that they are highly efficient at compressing large ensembles of structured data.
We explore the connection between deep learning and information theory through the paradigm of diffusion models. A diffusion model converts noise into structured data by reinstating, imperfectly, information that is erased when data is diffused to noise. This information is stored in a neural network during training. We quantify this information by introducing a measure called neural entropy, which is related to the total entropy produced by diffusion. Neural entropy is a function not only of the data distribution but also of the diffusive process itself. Measurements of neural entropy on a few simple image diffusion models reveal that they are extremely efficient at compressing large ensembles of structured data.
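To make the idea of information being erased by diffusion concrete, the following is a minimal toy sketch and not the paper's definition of neural entropy. It assumes one-dimensional Gaussian data and an Ornstein-Uhlenbeck forward process dx = -x dt + sqrt(2) dW, whose stationary distribution is the standard normal. As a stand-in for "information still present", it tracks the KL divergence between the diffused distribution and the noise prior. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import math


def ou_marginal(mu0, var0, t):
    """Mean and variance at time t of dx = -x dt + sqrt(2) dW,
    started from N(mu0, var0). (Assumed toy forward process.)"""
    decay = math.exp(-t)
    mean = mu0 * decay
    var = var0 * decay**2 + (1.0 - decay**2)
    return mean, var


def kl_to_standard_normal(mean, var):
    """KL( N(mean, var) || N(0, 1) ) in nats."""
    return 0.5 * (var + mean**2 - 1.0 - math.log(var))


def information_remaining(mu0, var0, times):
    """KL to the noise prior at each time: a proxy for how much
    structure of the data distribution has not yet been erased."""
    return [kl_to_standard_normal(*ou_marginal(mu0, var0, t)) for t in times]


# Hypothetical data distribution N(2.0, 0.25) diffused toward N(0, 1).
times = [0.0, 0.5, 1.0, 2.0, 4.0]
kl_values = information_remaining(mu0=2.0, var0=0.25, times=times)
for t, kl in zip(times, kl_values):
    print(f"t={t:.1f}  KL to noise prior = {kl:.4f} nats")
```

In this toy setting the KL divergence shrinks toward zero as the data is diffused, and its value depends on both the starting distribution and the chosen forward process. This loosely mirrors the abstract's remark that the quantity of interest is a function of the data distribution and of the diffusive process itself.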