CV | LG | Dec 6, 2021

Label-Efficient Semantic Segmentation with Diffusion Models

arXiv:2112.03126v3 | 710 citations | Has Code
Originality: Highly original
AI Analysis

This paper addresses label scarcity in semantic segmentation for computer vision, offering a novel application of diffusion models.

The paper tackles the problem of semantic segmentation with limited labeled data by leveraging pretrained diffusion models, showing that their intermediate activations capture semantic information and enable a simple method that outperforms existing alternatives on several datasets.

Denoising diffusion probabilistic models have recently received much research attention since they outperform alternative approaches, such as GANs, and currently provide state-of-the-art generative performance. The superior performance of diffusion models has made them an appealing tool in several applications, including inpainting, super-resolution, and semantic editing. In this paper, we demonstrate that diffusion models can also serve as an instrument for semantic segmentation, especially in the setup when labeled data is scarce. In particular, for several pretrained diffusion models, we investigate the intermediate activations from the networks that perform the Markov step of the reverse diffusion process. We show that these activations effectively capture the semantic information from an input image and appear to be excellent pixel-level representations for the segmentation problem. Based on these observations, we describe a simple segmentation method, which can work even if only a few training images are provided. Our approach significantly outperforms the existing alternatives on several datasets for the same amount of human supervision.
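The pipeline the abstract describes (noise an image, read intermediate activations from the denoising network, upsample them into per-pixel feature vectors, and fit a small classifier on a few labeled images) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_denoiser_activations` is a hypothetical stand-in for a pretrained diffusion network, the softmax-regression classifier is swapped in for simplicity, and all function names, the noise level `alpha_bar_t`, and the toy image are invented for the example.

```python
import numpy as np


def add_noise(x0, alpha_bar_t, rng):
    """Forward diffusion sample: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps


def toy_denoiser_activations(x_t):
    """HYPOTHETICAL stand-in for a pretrained denoiser's intermediate activations.

    Returns feature maps at two resolutions, each shaped (C, h, w).
    A real setup would hook into the blocks of a pretrained network instead.
    """
    h, w = x_t.shape
    full = np.stack([x_t, x_t ** 2])                                  # (2, H, W)
    half = x_t.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))[None]  # (1, H/2, W/2)
    return [full, half]


def pixel_features(activations, size):
    """Upsample every map to image resolution (nearest neighbour) and
    concatenate along channels, giving one feature vector per pixel."""
    height, width = size
    upsampled = []
    for act in activations:
        _, h, w = act.shape
        upsampled.append(act.repeat(height // h, axis=1).repeat(width // w, axis=2))
    feats = np.concatenate(upsampled, axis=0)       # (C_total, H, W)
    return feats.reshape(feats.shape[0], -1).T      # (H*W, C_total)


def train_pixel_classifier(X, y, n_classes, lr=0.5, steps=300):
    """Softmax regression via gradient descent (an assumed, simplified classifier)."""
    X = np.hstack([X, np.ones((len(X), 1))])        # append bias column
    W = np.zeros((X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]                        # one-hot labels
    for _ in range(steps):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        W -= lr * X.T @ (P - Y) / len(X)
    return W


def predict(W, X):
    X = np.hstack([X, np.ones((len(X), 1))])
    return (X @ W).argmax(axis=1)


# Toy "dataset": one 8x8 image whose right half belongs to class 1.
rng = np.random.default_rng(0)
mask = np.zeros((8, 8), dtype=int)
mask[:, 4:] = 1
image = 2.0 * mask - 1.0                            # pixel values in {-1, +1}


def featurize(img):
    x_t = add_noise(img, alpha_bar_t=0.99, rng=rng)
    return pixel_features(toy_denoiser_activations(x_t), img.shape)


W = train_pixel_classifier(featurize(image), mask.ravel(), n_classes=2)
pred = predict(W, featurize(image))                 # features from a fresh noise draw
acc = (pred == mask.ravel()).mean()
print(f"pixel accuracy: {acc:.2f}")
```

The point of the sketch is the data flow rather than the numbers: every labeled pixel becomes one training example, which is why a handful of annotated images can already yield many thousands of training samples for the classifier.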

Code Implementations: 1 repo
