CVAIMay 5, 2021

Contrastive Learning and Self-Training for Unsupervised Domain Adaptation in Semantic Segmentation

arXiv:2105.02001v132 citations
Originality Incremental advance
AI Analysis

This addresses the costly annotation issue for adapting segmentation models to new domains, though it is incremental as it builds on existing adversarial and self-training methods.

The paper tackles the problem of unsupervised domain adaptation in semantic segmentation by proposing a combination of contrastive learning and self-training with temporal ensembling, achieving state-of-the-art or comparable results on benchmarks like GTA5 → Cityscapes and SYNTHIA → Cityscapes.

Deep convolutional neural networks have considerably improved state-of-the-art results for semantic segmentation. Nevertheless, even modern architectures lack the ability to generalize well to a test dataset that originates from a different domain. To avoid the costly annotation of training data for unseen domains, unsupervised domain adaptation (UDA) attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain. Previous work has mainly focused on minimizing the discrepancy between the two domains by using adversarial training or self-training. While adversarial training may fail to align the correct semantic categories as it minimizes the discrepancy between the global distributions, self-training raises the question of how to provide reliable pseudo-labels. To align the correct semantic categories across domains, we propose a contrastive learning approach that adapts category-wise centroids across domains. Furthermore, we extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels. Although both contrastive learning and self-training (CLST) through temporal ensembling enable knowledge transfer between two domains, it is their combination that leads to a symbiotic structure. We validate our approach on two domain adaptation benchmarks: GTA5 $\rightarrow$ Cityscapes and SYNTHIA $\rightarrow$ Cityscapes. Our method achieves better or comparable results than the state-of-the-art. We will make the code publicly available.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes