CV | Mar 20, 2023

Generative Semantic Segmentation

arXiv:2303.11316v2 | 68 citations | h-index: 16
Originality: Highly original
AI Analysis

This work addresses semantic segmentation for computer vision applications, offering a novel generative method that improves cross-domain generalization.

The paper tackles semantic segmentation by reformulating it as an image-conditioned mask generation problem using a generative learning approach, achieving competitive performance on standard benchmarks and state-of-the-art results in cross-domain settings.

We present Generative Semantic Segmentation (GSS), a generative learning approach for semantic segmentation. Uniquely, we cast semantic segmentation as an image-conditioned mask generation problem. This is achieved by replacing the conventional per-pixel discriminative learning with a latent prior learning process. Specifically, we model the variational posterior distribution of latent variables given the segmentation mask. To that end, the segmentation mask is expressed as a special type of image (dubbed maskige). This posterior distribution allows segmentation masks to be generated unconditionally. To achieve semantic segmentation on a given image, we further introduce a conditioning network. It is optimized by minimizing the divergence between the posterior distribution of maskige (i.e., segmentation masks) and the latent prior distribution of input training images. Extensive experiments on standard benchmarks show that our GSS performs competitively with prior art alternatives in the standard semantic segmentation setting, whilst achieving a new state of the art in the more challenging cross-domain setting.
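To make the idea of expressing a segmentation mask as an image more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the simplest possible encoding: a fixed palette with one RGB colour per class, with decoding by nearest palette colour. The abstract does not specify how maskige is constructed, so the palette scheme and the helper names `make_palette`, `mask_to_maskige`, and `maskige_to_mask` are all hypothetical.

```python
import numpy as np


def make_palette(num_classes: int) -> np.ndarray:
    """Hypothetical helper: one distinct RGB colour per class, shape (K, 3)."""
    # Spread class codes evenly over the 24-bit colour range (an assumption,
    # not the paper's scheme), then unpack each code into R, G, B bytes.
    codes = np.linspace(0, 256 ** 3 - 1, num_classes).astype(np.int64)
    rgb = np.stack([(codes >> 16) & 255, (codes >> 8) & 255, codes & 255], axis=1)
    return rgb.astype(np.uint8)


def mask_to_maskige(mask: np.ndarray, palette: np.ndarray) -> np.ndarray:
    """Encode an (H, W) integer class mask as an (H, W, 3) colour image."""
    return palette[mask]


def maskige_to_mask(maskige: np.ndarray, palette: np.ndarray) -> np.ndarray:
    """Decode an (H, W, 3) image back to class ids by nearest palette colour."""
    pixels = maskige[..., None, :].astype(np.int32)    # (H, W, 1, 3)
    colours = palette[None, None].astype(np.int32)     # (1, 1, K, 3)
    sq_dist = ((pixels - colours) ** 2).sum(axis=-1)   # (H, W, K)
    return sq_dist.argmin(axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    palette = make_palette(num_classes=19)
    mask = rng.integers(0, 19, size=(4, 6))
    maskige = mask_to_maskige(mask, palette)
    recovered = maskige_to_mask(maskige, palette)
    print("round trip exact:", bool((recovered == mask).all()))
```

Nearest-colour decoding is chosen here so that a slightly imperfect generated image still maps back to valid class labels; under this assumed encoding, a generative model only needs to produce an approximately correct colour image.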

Code Implementations: 2 repos