ART for Diffusion Sampling: A Reinforcement Learning Approach to Timestep Schedule
This work addresses inefficiencies in diffusion sampling for generative modeling, offering a data-driven method that improves sample quality without retraining. The contribution is incremental in that it builds on existing diffusion frameworks.
The paper tackles suboptimal time discretization in diffusion models by introducing Adaptive Reparameterized Time (ART) to optimize timestep schedules. The learned schedules improve Fréchet Inception Distance (FID) on CIFAR-10 across a range of step budgets and transfer to AFHQv2, FFHQ, and ImageNet.
We consider time discretization for score-based diffusion models to generate samples from learned reverse-time dynamics on a finite grid. Uniform and hand-crafted grids can be suboptimal given a budget on the number of time steps. We introduce Adaptive Reparameterized Time (ART), which controls the clock speed of a reparameterized time variable, leading to a time change and uneven timesteps along the sampling trajectory while preserving the terminal time. The objective is to minimize the aggregate error arising from the discretized Euler scheme. We derive a randomized control companion, ART-RL, and formulate time change as a continuous-time reinforcement learning (RL) problem with Gaussian policies. We then prove that solving ART-RL recovers the optimal ART schedule, which in turn enables practical actor-critic updates to learn the latter in a data-driven way. Empirically, based on the official EDM pipeline, ART-RL improves Fréchet Inception Distance on CIFAR-10 over a wide range of budgets and transfers to AFHQv2, FFHQ, and ImageNet without retraining.
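To make the time-change idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a piecewise-constant clock speed on a uniform grid in reparameterized time, rescales the accumulated clock so the terminal time is preserved, and runs an Euler scheme on the resulting uneven grid. The function names (`time_changed_grid`, `euler`, `toy_drift`), the hand-picked speeds, and the toy scalar ODE standing in for the learned reverse-time dynamics are all illustrative assumptions. In ART-RL the speeds would be learned rather than chosen by hand.

```python
import numpy as np

def time_changed_grid(speed, t_start, t_end, n_steps):
    """Map a uniform grid in reparameterized time to an uneven grid in actual time.

    `speed` holds one positive clock speed per uniform interval (an assumed,
    piecewise-constant parameterization). The accumulated clock is rescaled so
    the grid starts at t_start and ends exactly at t_end.
    """
    speed = np.asarray(speed, dtype=float)
    if speed.shape != (n_steps,) or np.any(speed <= 0):
        raise ValueError("speed must contain n_steps positive values")
    frac = np.concatenate([[0.0], np.cumsum(speed)]) / speed.sum()
    grid = t_start + (t_end - t_start) * frac
    grid[-1] = t_end  # guard against floating-point drift at the terminal time
    return grid

def euler(drift, x0, grid):
    """Explicit Euler integration of dx/dt = drift(x, t) along `grid`."""
    x = float(x0)
    for t0, t1 in zip(grid[:-1], grid[1:]):
        x = x + (t1 - t0) * drift(x, t0)
    return x

def toy_drift(x, t):
    """Toy ODE dx/dt = x * t / (1 + t^2); exact solution x(t) = c * sqrt(1 + t^2)."""
    return x * t / (1.0 + t * t)

if __name__ == "__main__":
    T, n = 10.0, 8
    exact = 1.0 / np.sqrt(1.0 + T * T)  # x(0) when x(T) = 1

    uniform = time_changed_grid(np.ones(n), T, 0.0, n)
    # Hand-picked speeds: fast clock early, slow clock near t = 0 (illustrative only).
    uneven = time_changed_grid(np.linspace(2.0, 0.2, n), T, 0.0, n)

    for name, grid in [("uniform", uniform), ("uneven", uneven)]:
        err = abs(euler(toy_drift, 1.0, grid) - exact)
        print(f"{name:8s} steps={np.round(np.diff(grid), 3)} error={err:.5f}")
```

Both grids use the same step budget and share the same endpoints; only the allocation of steps along the trajectory differs, which is the quantity the schedule optimization acts on.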