Local Curvature Smoothing with Stein's Identity for Efficient Score Matching
This work addresses a key bottleneck in training score-based diffusion models and offers researchers and practitioners in generative AI a more efficient training method. The contribution is incremental, since it builds on existing score matching techniques.
The paper tackles the computational expense of the Jacobian trace in score matching for diffusion models by proposing LCSS, a score matching variant that uses Stein's identity to bypass this cost. LCSS achieves competitive performance on image generation metrics such as FID and enables high-resolution $1024 \times 1024$ generation.
The training of score-based diffusion models (SDMs) is based on score matching. The challenge of score matching is that its objective includes a computationally expensive Jacobian trace. While several methods have been proposed to avoid this computation, each has drawbacks, such as unstable training or learning only an approximation, namely a denoising vector field rather than the true score. We propose a novel score matching variant, local curvature smoothing with Stein's identity (LCSS). LCSS bypasses the Jacobian trace by applying Stein's identity, which yields both effective regularization and efficient computation. We show that LCSS surpasses existing methods in sample generation performance and matches the performance of denoising score matching, which is adopted by most SDMs, in evaluations such as FID, Inception score, and bits per dimension. Furthermore, we show that LCSS enables realistic image generation even at a high resolution of $1024 \times 1024$.
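The central step described above, replacing a Jacobian trace with an expression obtained from Stein's identity, can be illustrated on a toy problem. The sketch below is not the paper's LCSS objective. It only checks numerically the Gaussian form of Stein's identity, $\mathbb{E}[\mathrm{tr}\,\nabla s(x+\epsilon)] = \mathbb{E}[\epsilon^\top s(x+\epsilon)] / \sigma^2$ for $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$. The vector field `score`, the point `x`, the noise scale `sigma`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def score(x):
    # Hypothetical toy vector field standing in for a score network.
    return -x + np.sin(x)

def jacobian_trace(x):
    # Analytic Jacobian trace of the toy field: sum_i (-1 + cos(x_i)).
    # For a real network this is the expensive term.
    return np.sum(-1.0 + np.cos(x), axis=-1)

def smoothed_trace_direct(x, sigma, n, rng):
    # Monte Carlo average of the exact trace over Gaussian perturbations.
    eps = sigma * rng.standard_normal((n, x.shape[-1]))
    return float(jacobian_trace(x + eps).mean())

def smoothed_trace_stein(x, sigma, n, rng):
    # Same quantity via Stein's identity: only forward evaluations of
    # `score` are needed, with no derivatives of the field.
    eps = sigma * rng.standard_normal((n, x.shape[-1]))
    inner = np.sum(eps * score(x + eps), axis=-1)
    return float(inner.mean() / sigma**2)

rng = np.random.default_rng(0)
x = np.array([0.3, -1.2, 0.8, 2.0])  # illustrative evaluation point
sigma = 0.5                          # illustrative noise scale

direct = smoothed_trace_direct(x, sigma, 200_000, rng)
stein = smoothed_trace_stein(x, sigma, 200_000, rng)
print(f"direct: {direct:.3f}  stein: {stein:.3f}")
```

Both estimators target the same noise-smoothed trace, so the two printed values should agree up to Monte Carlo error. The Stein form needs no differentiation of the field, which is the source of the efficiency gain. Its variance grows as `sigma` shrinks, so the noise scale trades off bias from smoothing against estimator noise.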