CV | Jul 22, 2022

Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation

Geon Lee, Chanho Eom, Wonkyung Lee, Hyekang Park, Bumsub Ham

arXiv:2207.10892v1 | 12.737 citations | h-index: 34

Originality: Incremental advance

AI Analysis

This paper addresses domain shift in semantic segmentation, where a model trained on labeled source images degrades on an unlabeled target domain. It presents an incremental improvement over existing methods.

The paper tackles unsupervised domain adaptation for semantic segmentation by introducing a bi-directional pixel-prototype contrastive learning framework to learn domain-invariant and discriminative features without target ground-truth labels, achieving improved performance on benchmark datasets.

We present a novel unsupervised domain adaptation method for semantic segmentation that generalizes a model trained with source images and corresponding ground-truth labels to a target domain. A key to domain adaptive semantic segmentation is to learn domain-invariant and discriminative features without target ground-truth labels. To this end, we propose a bi-directional pixel-prototype contrastive learning framework that minimizes intra-class variations of features for the same object class, while maximizing inter-class variations for different ones, regardless of domains. Specifically, our framework aligns pixel-level features and a prototype of the same object class in target and source images (i.e., positive pairs), respectively, sets them apart for different classes (i.e., negative pairs), and performs the alignment and separation processes toward the other direction with pixel-level features in the source image and a prototype in the target image. The cross-domain matching encourages domain-invariant feature representations, while the bidirectional pixel-prototype correspondences aggregate features for the same object class, providing discriminative features. To establish training pairs for contrastive learning, we propose to generate dynamic pseudo labels of target images using a non-parametric label transfer, that is, pixel-prototype correspondences across different domains. We also present a calibration method compensating class-wise domain biases of prototypes gradually during training.
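The bi-directional pixel-prototype matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes features are short plain-Python vectors, prototypes are per-class feature means, similarity is cosine, and the contrastive term is a temperature-scaled cross-entropy over prototypes. All function names and the `tau` parameter are hypothetical, and the calibration of class-wise domain biases is omitted.

```python
import math


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def class_prototypes(features, labels, num_classes):
    """Mean feature per class; a class with no pixels keeps a zero vector."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for f, y in zip(features, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += f[d]
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def similarities(feature, prototypes):
    """Cosine similarity between one pixel feature and every prototype."""
    f = normalize(feature)
    return [sum(a * b for a, b in zip(f, normalize(p))) for p in prototypes]


def transfer_labels(features, prototypes):
    """Non-parametric label transfer: assign each pixel its nearest prototype."""
    labels = []
    for f in features:
        sims = similarities(f, prototypes)
        labels.append(max(range(len(sims)), key=lambda k: sims[k]))
    return labels


def pixel_prototype_loss(features, labels, prototypes, tau=0.1):
    """Pull each pixel toward its class prototype and away from the others."""
    total = 0.0
    for f, y in zip(features, labels):
        logits = [s / tau for s in similarities(f, prototypes)]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum - logits[y]
    return total / len(features)


def bidirectional_loss(src_feats, src_labels, tgt_feats, num_classes, tau=0.1):
    """Sum of target-pixel/source-prototype and source-pixel/target-prototype terms."""
    src_protos = class_prototypes(src_feats, src_labels, num_classes)
    tgt_pseudo = transfer_labels(tgt_feats, src_protos)  # pseudo labels
    tgt_protos = class_prototypes(tgt_feats, tgt_pseudo, num_classes)
    forward = pixel_prototype_loss(tgt_feats, tgt_pseudo, src_protos, tau)
    backward = pixel_prototype_loss(src_feats, src_labels, tgt_protos, tau)
    return forward + backward, tgt_pseudo


if __name__ == "__main__":
    src = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    src_y = [0, 0, 1, 1]
    tgt = [[0.8, 0.2], [0.2, 0.8]]
    loss, pseudo = bidirectional_loss(src, src_y, tgt, num_classes=2)
    print("pseudo labels:", pseudo, "loss:", loss)
```

In this toy setting the pseudo labels are recomputed from the current source prototypes on every call, which loosely mirrors the "dynamic" pseudo labels the abstract mentions. A real segmentation model would compute these quantities on dense feature maps inside a training loop.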

