Dimension-free Score Matching and Time Bootstrapping for Diffusion Models
This work addresses the sample-complexity bottleneck that limits diffusion models in high dimensions. It contributes a theoretical result (nearly dimension-free bounds for learning score functions) and a practical one (a variance reduction technique for score estimation at higher noise levels).
The paper tackles the problem of high sample complexity in diffusion models by establishing the first nearly dimension-free sample complexity bounds for learning score functions, achieving a double exponential improvement in dimension over prior results. It introduces Bootstrapped Score Matching (BSM), a variance reduction technique that leverages learned scores to improve accuracy at higher noise levels.
Diffusion models generate samples by estimating the score function of the target distribution at various noise levels. The model is trained on samples drawn from the target distribution, to which noise is progressively added. Previous sample complexity bounds have polynomial dependence on the dimension $d$, apart from a $\log(|\mathcal{H}|)$ term, where $\mathcal{H}$ is the hypothesis class. In this work, we establish the first (nearly) dimension-free sample complexity bounds, modulo the $\log(|\mathcal{H}|)$ dependence, for learning these score functions, achieving a double exponential improvement in the dimension over prior results. A key aspect of our analysis is the use of a single function approximator to jointly estimate scores across noise levels, a practical feature that enables generalization across time steps. We introduce a martingale-based error decomposition and sharp variance bounds, enabling efficient learning from dependent data generated by Markov processes, which may be of independent interest. Building on these insights, we propose Bootstrapped Score Matching (BSM), a variance reduction technique that leverages previously learned scores to improve accuracy at higher noise levels. These results provide insights into the efficiency and effectiveness of diffusion models for generative modeling.
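As a rough illustration of what "learning the score at a given noise level" means, the sketch below fits a score estimate by standard denoising score matching on a toy problem. It is not the paper's method and does not implement BSM. The Ornstein-Uhlenbeck forward process, the Gaussian target, the one-parameter linear model $s(x) = -a\,x$, and the function names are all assumptions chosen so that the fitted coefficient can be checked against a closed form.

```python
import numpy as np

def forward_noise(x0, t, rng):
    """Assumed OU forward process: x_t = e^{-t} x0 + sqrt(1 - e^{-2t}) z."""
    z = rng.standard_normal(x0.shape)
    sigma = np.sqrt(1.0 - np.exp(-2.0 * t))
    return np.exp(-t) * x0 + sigma * z, z, sigma

def fit_linear_score(x0, t, rng):
    """Fit s(x) = -a * x by least squares on the denoising target -z / sigma."""
    xt, z, sigma = forward_noise(x0, t, rng)
    target = -z / sigma  # conditional score of x_t given x0
    # Minimizer of sum (-a * x_t - target)^2 over the scalar a.
    return -np.sum(xt * target) / np.sum(xt * xt)

def true_coeff(s2, t):
    """Exact coefficient for a N(0, s2 * I) target: the score of p_t is -x / var_t."""
    var_t = np.exp(-2.0 * t) * s2 + 1.0 - np.exp(-2.0 * t)
    return 1.0 / var_t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = 2.0 * rng.standard_normal((50_000, 2))  # toy target N(0, 4 I)
    for t in (0.1, 0.5, 1.0):
        print(t, fit_linear_score(x0, t, rng), true_coeff(4.0, t))
```

In this toy setting the regression target $-z/\sigma_t$ has variance $1/\sigma_t^2$, so the size of the target noise depends on the noise level; variance reduction techniques for score estimation aim to control this kind of target variance. Here each noise level is fitted separately, whereas the analysis in the abstract concerns a single function approximator shared across noise levels.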