LG ML · Jan 28, 2024

Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization

Yinbin Han, Meisam Razaviyayn, Renyuan Xu

arXiv:2401.15604v325.137 citations · h-index: 32 · ICLR

Originality: Incremental advance

AI Analysis

The paper provides theoretical foundations for score estimation in diffusion models, addressing a key bottleneck in generative AI, though the advance is incremental because it builds on existing neural tangent kernel techniques.

This paper tackles the problem of provably learning the score function in diffusion models using neural networks trained by gradient descent, establishing the first generalization error bounds for this task despite noisy observations.

Diffusion models have emerged as a powerful tool rivaling GANs in generating high-quality samples with improved fidelity, flexibility, and robustness. A key component of these models is learning the score function through score matching. Despite empirical success on various tasks, it remains unclear whether gradient-based algorithms can learn the score function with provable accuracy. As a first step toward answering this question, this paper establishes a mathematical framework for analyzing score estimation using neural networks trained by gradient descent. Our analysis covers both the optimization and the generalization aspects of the learning procedure. In particular, we propose a parametric form to formulate the denoising score-matching problem as a regression with noisy labels. Compared to the standard supervised learning setup, the score-matching problem introduces distinct challenges, including unbounded input, vector-valued output, and an additional time variable, preventing existing techniques from being applied directly. In this paper, we show that with proper designs, the evolution of neural networks during training can be accurately modeled by a series of kernel regression tasks. Furthermore, by applying an early-stopping rule for gradient descent and leveraging recent developments in neural tangent kernels, we establish the first generalization error (sample complexity) bounds for learning the score function with neural networks, despite the presence of noise in the observations. Our analysis is grounded in a novel parametric form of the neural network and an innovative connection between score matching and regression analysis, facilitating the application of advanced statistical and optimization techniques.
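To make the "regression with noisy labels" view concrete, here is a minimal toy sketch in Python. It is not the paper's parametrization or network: the one-dimensional Gaussian data, the Ornstein-Uhlenbeck forward process, the linear score model, and the function name `dsm_regression_slope` are all assumptions chosen so that the true score is known in closed form. Each training pair is a noised sample `x_t` with the label `-eps / sigma_t`, which is a noisy but unbiased target for the true score at `x_t`.

```python
import numpy as np


def dsm_regression_slope(n=200_000, data_std=2.0, t=0.5, seed=0):
    """Fit a linear score model s(x) = w * x to denoising score-matching labels.

    Toy setting (assumed for illustration): x_0 ~ N(0, data_std^2) in 1D.
    Returns the fitted slope and the exact slope of the true score.
    """
    rng = np.random.default_rng(seed)

    # Assumed forward process: x_t = alpha_t * x_0 + sigma_t * eps,
    # with alpha_t = exp(-t) and sigma_t^2 = 1 - exp(-2t).
    alpha = np.exp(-t)
    sigma = np.sqrt(1.0 - np.exp(-2.0 * t))

    x0 = data_std * rng.standard_normal(n)
    eps = rng.standard_normal(n)
    xt = alpha * x0 + sigma * eps

    # Noisy regression label: the conditional score of x_t given x_0.
    labels = -eps / sigma

    # Least-squares slope for a linear model without intercept.
    w_hat = np.dot(xt, labels) / np.dot(xt, xt)

    # x_t ~ N(0, alpha^2 * data_std^2 + sigma^2), so the true score is
    # -x / (alpha^2 * data_std^2 + sigma^2).
    w_true = -1.0 / (alpha**2 * data_std**2 + sigma**2)
    return w_hat, w_true


if __name__ == "__main__":
    w_hat, w_true = dsm_regression_slope()
    print(f"fitted slope: {w_hat:.4f}, true score slope: {w_true:.4f}")
```

Although no individual label equals the true score, the least-squares fit recovers the slope of the true score in this toy case, because the conditional mean of the label given `x_t` is the score itself. The paper's setting replaces the linear model with a neural network trained by gradient descent with early stopping.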
