LG · CV · Nov 15, 2024

Adaptive Non-uniform Timestep Sampling for Accelerating Diffusion Model Training

arXiv:2411.09998v2 · 6 citations · h-index: 4 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses slow, inefficient training of diffusion models. The contribution is incremental: it optimizes an existing training process rather than introducing a new paradigm.

The paper tackles the computational bottleneck in diffusion model training caused by high-variance timesteps, introducing an adaptive non-uniform sampling method that accelerates training and improves convergence performance across various datasets and architectures.

As highly expressive generative models, diffusion models have demonstrated exceptional success across various domains, including image generation, natural language processing, and combinatorial optimization. However, as data distributions grow more complex, training these models to convergence becomes increasingly computationally intensive. While diffusion models are typically trained using uniform timestep sampling, our research shows that the variance in stochastic gradients varies significantly across timesteps, with high-variance timesteps becoming bottlenecks that hinder faster convergence. To address this issue, we introduce a non-uniform timestep sampling method that prioritizes these more critical timesteps. Our method tracks the impact of gradient updates on the objective for each timestep, adaptively selecting those most likely to minimize the objective effectively. Experimental results demonstrate that this approach not only accelerates the training process, but also leads to improved performance at convergence. Furthermore, our method shows robust performance across various datasets, scheduling strategies, and diffusion architectures, outperforming previously proposed timestep sampling and weighting heuristics that lack this degree of robustness.
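The abstract does not spell out the sampling rule, so the following is only a minimal sketch of the general idea: keep a running estimate of how much an update at each timestep lowered the objective, and sample timesteps in proportion to that estimate. The class name `AdaptiveTimestepSampler`, the exponential moving average, the softmax with `temperature`, and the `uniform_mix` exploration floor are all illustrative assumptions, not the authors' algorithm.

```python
import math
import random


class AdaptiveTimestepSampler:
    """Hypothetical sampler (not the paper's method): favors timesteps
    whose recent gradient updates reduced the loss the most."""

    def __init__(self, num_timesteps, ema_decay=0.9, temperature=1.0, uniform_mix=0.1):
        self.num_timesteps = num_timesteps
        self.ema_decay = ema_decay        # assumed smoothing factor
        self.temperature = temperature    # assumed softmax sharpness
        self.uniform_mix = uniform_mix    # assumed floor so no timestep starves
        # Running estimate of the loss reduction attributed to each timestep.
        self.scores = [0.0] * num_timesteps

    def probabilities(self):
        # Softmax over scores, shifted by the max for numerical stability,
        # then blended with a uniform distribution.
        top = max(self.scores)
        exps = [math.exp((s - top) / self.temperature) for s in self.scores]
        total = sum(exps)
        uniform = 1.0 / self.num_timesteps
        return [
            (1.0 - self.uniform_mix) * e / total + self.uniform_mix * uniform
            for e in exps
        ]

    def sample(self, batch_size, rng=random):
        # Draw a batch of timestep indices from the current distribution.
        return rng.choices(
            range(self.num_timesteps), weights=self.probabilities(), k=batch_size
        )

    def update(self, t, loss_before, loss_after):
        # Exponential moving average of the observed improvement at timestep t.
        improvement = loss_before - loss_after
        self.scores[t] = (
            self.ema_decay * self.scores[t] + (1.0 - self.ema_decay) * improvement
        )
```

In a training loop, one would call `sample` to pick the timesteps for a batch, take a gradient step, and then call `update` with the loss measured before and after the step. With all scores at zero the sampler reduces to uniform timestep sampling, the baseline the abstract compares against.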

