Error as Signal: Stiffness-Aware Diffusion Sampling via Embedded Runge-Kutta Guidance
This paper addresses a specific bottleneck in diffusion model sampling for image generation, offering an incremental improvement over existing guidance mechanisms such as Classifier-Free Guidance and Autoguidance.
The paper tackles solver-induced errors that degrade sample quality in diffusion models, particularly in stiff regions where ODE trajectories change sharply. It proposes Embedded Runge-Kutta Guidance (ERK-Guid), which uses these errors as guidance signals to reduce local truncation error and stabilize sampling. Experiments on ImageNet show that it consistently outperforms state-of-the-art methods.
Classifier-Free Guidance (CFG) has established the foundation for guidance mechanisms in diffusion models, showing that well-designed guidance proxies significantly improve conditional generation and sample quality. Autoguidance (AG) has extended this idea, but it relies on an auxiliary network and leaves solver-induced errors unaddressed. In stiff regions, where the ODE trajectory changes sharply, local truncation error (LTE) becomes a critical factor that deteriorates sample quality. Our key observation is that these errors align with the dominant eigenvector, motivating us to leverage the solver-induced error as a guidance signal. We propose Embedded Runge-Kutta Guidance (ERK-Guid), which exploits detected stiffness to reduce LTE and stabilize sampling. We theoretically and empirically analyze stiffness and eigenvector estimators based on solver errors to motivate the design of ERK-Guid. Our experiments on both synthetic datasets and the popular benchmark dataset, ImageNet, demonstrate that ERK-Guid consistently outperforms state-of-the-art methods. Code is available at https://github.com/mlvlab/ERK-Guid.
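The observation that solver error aligns with the dominant eigenvector can be illustrated on a toy problem. The sketch below is not the ERK-Guid implementation and makes no claim about the paper's estimators or its guidance rule. It assumes a hypothetical two-dimensional linear ODE dx/dt = A x with eigenvalues -1 and -50, and uses the textbook embedded Euler/Heun pair, whose difference serves as a local error estimate. All names (embedded_heun_euler, THETA, EIGS, the stiffness proxy) are illustrative choices, not taken from the paper or its repository.

```python
import math

# Hypothetical stiff linear ODE dx/dt = A x (illustration only, not the paper's setup).
# A = Q diag(-1, -50) Q^T, so the dominant eigenvector is the second column of Q.
THETA = 0.3
Q = [[math.cos(THETA), -math.sin(THETA)],
     [math.sin(THETA), math.cos(THETA)]]
EIGS = [-1.0, -50.0]


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def transpose(m):
    """Transpose a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def f(x):
    """Right-hand side A x, computed as Q diag(EIGS) Q^T x."""
    y = matvec(transpose(Q), x)
    y = [EIGS[0] * y[0], EIGS[1] * y[1]]
    return matvec(Q, y)


def norm(v):
    return math.hypot(v[0], v[1])


def embedded_heun_euler(x, h):
    """One step of the embedded Euler (order 1) / Heun (order 2) pair.

    Returns the Heun update, the error estimate (Heun minus Euler),
    and an assumed stiffness proxy ||f(x_euler) - f(x)|| / ||x_euler - x||.
    """
    k1 = f(x)
    x_euler = [x[0] + h * k1[0], x[1] + h * k1[1]]
    k2 = f(x_euler)
    x_heun = [x[0] + 0.5 * h * (k1[0] + k2[0]),
              x[1] + 0.5 * h * (k1[1] + k2[1])]
    err = [x_heun[0] - x_euler[0], x_heun[1] - x_euler[1]]
    df = [k2[0] - k1[0], k2[1] - k1[1]]
    dx = [x_euler[0] - x[0], x_euler[1] - x[1]]
    stiffness = norm(df) / norm(dx)
    return x_heun, err, stiffness


# Start with equal weight on both eigen-directions.
x0 = matvec(Q, [1.0, 1.0])
_, err, stiffness = embedded_heun_euler(x0, h=0.01)

dominant = [Q[0][1], Q[1][1]]  # unit eigenvector for eigenvalue -50
cosine = abs(err[0] * dominant[0] + err[1] * dominant[1]) / norm(err)

print("cosine(error, dominant eigenvector):", cosine)
print("stiffness proxy:", stiffness)
```

For a linear system the error estimate equals (h^2 / 2) A^2 x, so each eigen-component of x is scaled by the square of its eigenvalue. That is why the error points almost entirely along the fast direction even though x0 weights both directions equally, and why the proxy lands close to the largest eigenvalue magnitude. In a diffusion sampler the drift is nonlinear and high-dimensional, so this should be read only as intuition for why an embedded pair's error can carry stiffness information.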