LG · ML · Jun 5, 2023

Faster Training of Diffusion Models and Improved Density Estimation via Parallel Score Matching

arXiv:2306.02658v1 · 4 citations · h-index: 27
Originality: Incremental advance
AI Analysis

This work addresses training-efficiency and performance bottlenecks in diffusion models. Its contribution is an incremental improvement, relevant mainly to machine learning practitioners working on generative modeling.

The paper tackles the slow training and limited flexibility of Diffusion Probabilistic Models by splitting the score-learning task across independent networks, each responsible for one time sub-interval or one individual time point. Because these networks can be trained in parallel, training is significantly faster, and density estimation improves on synthetic and image datasets.

In Diffusion Probabilistic Models (DPMs), the task of modeling the score evolution via a single time-dependent neural network necessitates extended training periods and may potentially impede modeling flexibility and capacity. To counteract these challenges, we propose leveraging the independence of learning tasks at different time points inherent to DPMs. More specifically, we partition the learning task by utilizing independent networks, each dedicated to learning the evolution of scores within a specific time sub-interval. Further, inspired by residual flows, we extend this strategy to its logical conclusion by employing separate networks to independently model the score at each individual time point. As empirically demonstrated on synthetic and image datasets, our approach not only significantly accelerates the training process by introducing an additional layer of parallelization atop data parallelization, but it also enhances density estimation performance when compared to the conventional training methodology for DPMs.
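
The partitioning idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy 1-D setting with data x0 ~ N(0, 1), a hypothetical noise schedule `sigma(t)`, and a linear-in-parameters score model fitted by least squares in place of a neural network trained by gradient descent. All function names are illustrative. The point it shows is that each time sub-interval has its own model and its own denoising score matching fit, so the fits share nothing and can run concurrently.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Assumed toy setup: x0 ~ N(0, 1), perturbed as x_t = x0 + sigma(t) * eps.
def sigma(t):
    return 0.5 + 1.5 * t  # hypothetical noise schedule on t in [0, 1]

def features(x, t):
    # Linear-in-parameters score model: s(x, t) = w . [1, x, t, x*t]
    return np.stack([np.ones_like(x), x, t, x * t], axis=1)

def fit_interval(job):
    """Fit one independent score model on the sub-interval [t_lo, t_hi)."""
    t_lo, t_hi, seed, n = job
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(n)
    t = rng.uniform(t_lo, t_hi, n)
    eps = rng.standard_normal(n)
    xt = x0 + sigma(t) * eps
    target = -eps / sigma(t)  # denoising score matching regression target
    w, *_ = np.linalg.lstsq(features(xt, t), target, rcond=None)
    return w

def train_parallel(num_intervals=4, n=20000):
    # Split [0, 1] into equal sub-intervals; each job is fully independent.
    edges = np.linspace(0.0, 1.0, num_intervals + 1)
    jobs = [(edges[k], edges[k + 1], k, n) for k in range(num_intervals)]
    with ThreadPoolExecutor() as pool:
        weights = list(pool.map(fit_interval, jobs))
    return edges, weights

def score(x, t, edges, weights):
    # Route the query time (assumed in [0, 1]) to the model owning its sub-interval.
    k = min(int(np.searchsorted(edges, t, side="right")) - 1, len(weights) - 1)
    phi = features(np.atleast_1d(float(x)), np.atleast_1d(float(t)))
    return float((phi @ weights[k])[0])

if __name__ == "__main__":
    edges, weights = train_parallel()
    # For this toy data the exact score is -x / (1 + sigma(t)**2).
    print(score(1.0, 0.6, edges, weights), -1.0 / (1.0 + sigma(0.6) ** 2))
```

A thread pool stands in here for the extra layer of parallelism; in a realistic setting each sub-interval's network would more plausibly be trained on its own device or process, on top of ordinary data parallelism.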

