CVLGMar 6, 2023

Learning multi-scale local conditional probability models of images

arXiv:2303.02984v125 citationsh-index: 89Has Code
Originality Incremental advance
AI Analysis

This work provides insights into reducing the curse of dimensionality in score-based diffusion models for image generation, which is incremental for researchers in generative modeling and computer vision.

The authors tackled the problem of understanding how deep neural networks capture global statistical structure in images without suffering from the curse of dimensionality, by developing a multi-scale diffusion model with local Markov assumptions in the wavelet domain, and demonstrated its effectiveness on face images with smaller conditioning neighborhoods than pixel-domain methods.

Deep neural networks can learn powerful prior probability models for images, as evidenced by the high-quality generations obtained with recent score-based diffusion methods. But the means by which these networks capture complex global statistical structure, apparently without suffering from the curse of dimensionality, remain a mystery. To study this, we incorporate diffusion methods into a multi-scale decomposition, reducing dimensionality by assuming a stationary local Markov model for wavelet coefficients conditioned on coarser-scale coefficients. We instantiate this model using convolutional neural networks (CNNs) with local receptive fields, which enforce both the stationarity and Markov properties. Global structures are captured using a CNN with receptive fields covering the entire (but small) low-pass image. We test this model on a dataset of face images, which are highly non-stationary and contain large-scale geometric structures. Remarkably, denoising, super-resolution, and image synthesis results all demonstrate that these structures can be captured with significantly smaller conditioning neighborhoods than required by a Markov model implemented in the pixel domain. Our results show that score estimation for large complex images can be reduced to low-dimensional Markov conditional models across scales, alleviating the curse of dimensionality.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes