CV | Oct 5, 2022

GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models

arXiv:2210.02025v1 | 173 citations | h-index: 77
Originality: Highly original
AI Analysis

This work addresses a limitation of current segmentation models, their difficulty identifying out-of-distribution data, with a hybrid generative-discriminative approach.

The paper proposes GMMSeg, a semantic segmentation model that uses Gaussian Mixture Models to capture class-conditional densities. It outperforms its discriminative counterparts on three closed-set datasets and performs well on open-world datasets without modification.

Prevalent semantic segmentation solutions are, in essence, dense discriminative classifiers of p(class|pixel feature). Though straightforward, this de facto paradigm neglects the underlying data distribution p(pixel feature|class) and struggles to identify out-of-distribution data. Going beyond this, we propose GMMSeg, a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class). For each class, GMMSeg builds Gaussian Mixture Models (GMMs) via Expectation-Maximization (EM), so as to capture class-conditional densities. Meanwhile, the deep dense representation is trained end-to-end in a discriminative manner, i.e., by maximizing p(class|pixel feature). This endows GMMSeg with the strengths of both generative and discriminative models. With a variety of segmentation architectures and backbones, GMMSeg outperforms its discriminative counterparts on three closed-set datasets. More impressively, without any modification, GMMSeg even performs well on open-world datasets. We believe this work brings fundamental insights into the related fields.
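
To make the idea of a generative classifier concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It fits one small GMM per class with textbook EM on toy 2-D "pixel features", then applies Bayes' rule to obtain p(class|pixel feature) from p(pixel feature|class) and a class prior. The diagonal covariances, the fixed number of components, the frequency-based priors, the toy data, and every function name below are assumptions made for illustration only; in GMMSeg the features come from a deep network rather than from hand-made blobs.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def fit_gmm(points, k=2, iters=30, seed=0, min_var=1e-3):
    """Fit a k-component diagonal GMM to one class's features with plain EM."""
    rng = random.Random(seed)
    dim = len(points[0])
    means = [list(p) for p in rng.sample(points, k)]  # init means from data
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in points:
            logs = [math.log(weights[j]) + log_gauss_diag(x, means[j], variances[j])
                    for j in range(k)]
            norm = logsumexp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(points)
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, points)) / nj
                        for d in range(dim)]
            variances[j] = [max(min_var,
                                sum(r[j] * (x[d] - means[j][d]) ** 2
                                    for r, x in zip(resp, points)) / nj)
                            for d in range(dim)]
    return weights, means, variances

def log_density(x, gmm):
    """log p(x | class) under that class's fitted mixture."""
    weights, means, variances = gmm
    return logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                      for w, m, v in zip(weights, means, variances)])

def classify(x, gmms, priors):
    """Bayes' rule: returns (predicted class, posterior dict, log p(x))."""
    joint = {c: math.log(priors[c]) + log_density(x, gmms[c]) for c in gmms}
    log_px = logsumexp(list(joint.values()))
    posterior = {c: math.exp(v - log_px) for c, v in joint.items()}
    return max(posterior, key=posterior.get), posterior, log_px

def make_blob(rng, center, n, spread=0.3):
    """Toy stand-in for pixel features: points scattered around a center."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]

rng = random.Random(0)
train = {  # hypothetical classes, each with two feature clusters
    "road": make_blob(rng, (0.0, 0.0), 60) + make_blob(rng, (2.0, 0.0), 60),
    "sky": make_blob(rng, (0.0, 4.0), 60) + make_blob(rng, (2.0, 4.0), 60),
}
total = sum(len(p) for p in train.values())
priors = {c: len(p) / total for c, p in train.items()}
gmms = {c: fit_gmm(p, k=2) for c, p in train.items()}

label_in, post_in, log_px_in = classify([0.1, 0.1], gmms, priors)
label_far, post_far, log_px_far = classify([30.0, -30.0], gmms, priors)
print(label_in, log_px_in, log_px_far)
```

The posterior for the far-away point still sums to one, so on its own it gives no sign that the input is unusual. The quantity log p(x), by contrast, drops sharply for that point, which is the kind of density signal a purely discriminative classifier does not expose. Thresholding it is one plausible way to flag out-of-distribution inputs, though the threshold itself would need to be chosen per application.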
