LG ML

Nov 8, 2021

Estimating High Order Gradients of the Data Distribution by Denoising

Chenlin Meng, Yang Song, Wenzhe Li, Stefano Ermon

arXiv:2111.04726v1

23.077 citations

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in score-based machine learning, which underlies tasks such as image generation and audio synthesis, by providing a more efficient way to estimate higher-order gradients of the data density. The advance is incremental, since it builds on existing denoising score matching techniques.

The paper tackles the problem of estimating high-order derivatives of a data distribution. Obtaining them by automatic differentiation of a learned density model is costly and amplifies estimation errors, so the authors generalize denoising score matching using Tweedie's formula. Their experiments show that the proposed method approximates second-order derivatives more accurately and efficiently than automatic differentiation, enabling applications such as uncertainty quantification in denoising and faster mixing of Langevin dynamics for image synthesis.

The first order derivative of a data density can be estimated efficiently by denoising score matching, and has become an important component in many applications, such as image generation and audio synthesis. Higher order derivatives provide additional local information about the data distribution and enable new applications. Although they can be estimated via automatic differentiation of a learned density model, this can amplify estimation errors and is expensive in high dimensional settings. To overcome these limitations, we propose a method to directly estimate high order derivatives (scores) of a data density from samples. We first show that denoising score matching can be interpreted as a particular case of Tweedie's formula. By leveraging Tweedie's formula on higher order moments, we generalize denoising score matching to estimate higher order derivatives. We demonstrate empirically that models trained with the proposed method can approximate second order derivatives more efficiently and accurately than via automatic differentiation. We show that our models can be used to quantify uncertainty in denoising and to improve the mixing speed of Langevin dynamics via Ozaki discretization for sampling synthetic data and natural images.
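The link between denoising and Tweedie's formula mentioned in the abstract can be checked numerically in a toy case. The sketch below is not the paper's method or code; it assumes 1-D Gaussian data x ~ N(0, s0^2) corrupted as x_noisy = x + sigma * z, for which the noisy density and its derivatives are known in closed form. In this setting Tweedie's formula gives E[x | x_noisy] = x_noisy + sigma^2 * s1(x_noisy) and Var[x | x_noisy] = sigma^2 + sigma^4 * s2(x_noisy), where s1 and s2 are the first and second derivatives of the log density of x_noisy. The function name `tweedie_check` and all parameter values are illustrative assumptions.

```python
import numpy as np

def tweedie_check(s0=2.0, sigma=0.5, n=200_000, seed=0):
    """Compare an empirical denoiser with Tweedie's formula (1-D Gaussian toy case)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, s0, size=n)            # clean samples
    x_noisy = x + sigma * rng.normal(size=n)   # Gaussian-corrupted samples

    # Closed-form derivatives of log q(x_noisy), where q = N(0, s0^2 + sigma^2):
    #   s1(x_noisy) = -x_noisy / (s0^2 + sigma^2)   (first-order score)
    #   s2(x_noisy) = -1 / (s0^2 + sigma^2)         (second-order score)
    var_noisy = s0**2 + sigma**2
    s1_coef = -1.0 / var_noisy
    s2 = -1.0 / var_noisy

    # Empirical optimal denoiser: for Gaussian data it is linear, x ~ a * x_noisy,
    # so a least-squares fit recovers the posterior mean.
    a = np.dot(x_noisy, x) / np.dot(x_noisy, x_noisy)
    residual_var = np.mean((x - a * x_noisy) ** 2)  # estimates Var[x | x_noisy]

    # Tweedie predictions from the first- and second-order scores.
    tweedie_slope = 1.0 + sigma**2 * s1_coef
    tweedie_var = sigma**2 + sigma**4 * s2
    return a, tweedie_slope, residual_var, tweedie_var

if __name__ == "__main__":
    a, slope, res_var, tw_var = tweedie_check()
    print(f"denoiser slope: empirical {a:.4f}, Tweedie {slope:.4f}")
    print(f"posterior variance: empirical {res_var:.4f}, Tweedie {tw_var:.4f}")
```

The empirical slope and residual variance should agree with the Tweedie predictions up to sampling noise. This illustrates why a second-order score estimate can quantify denoising uncertainty: it determines the posterior variance around the denoised point.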
