CV · Apr 6, 2022

Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation

arXiv:2204.02548v2 · 153 citations · h-index: 101
AI Analysis

The paper addresses domain shift in semantic segmentation for applications such as autonomous driving, but its contribution is incremental because it builds on existing consistency learning methods.

The paper tackles synthetic-to-real domain generalization for semantic segmentation by proposing the SHADE framework, which combines style hallucination with dual consistency learning to improve robustness to unseen real-world scenes. It outperforms state-of-the-art methods by 5.05% and 8.35% average mIoU over three real-world datasets in the single- and multi-source settings, respectively.

In this paper, we study the task of synthetic-to-real domain generalized semantic segmentation, which aims to learn a model that is robust to unseen real-world scenes using only synthetic data. The large domain shift between synthetic and real-world data, including the limited source environmental variations and the large distribution gap between synthetic and real-world data, significantly hinders the model performance on unseen real-world scenes. In this work, we propose the Style-HAllucinated Dual consistEncy learning (SHADE) framework to handle such domain shift. Specifically, SHADE is constructed based on two consistency constraints, Style Consistency (SC) and Retrospection Consistency (RC). SC enriches the source situations and encourages the model to learn consistent representation across style-diversified samples. RC leverages real-world knowledge to prevent the model from overfitting to synthetic data and thus largely keeps the representation consistent between the synthetic and real-world models. Furthermore, we present a novel style hallucination module (SHM) to generate style-diversified samples that are essential to consistency learning. SHM selects basis styles from the source distribution, enabling the model to dynamically generate diverse and realistic samples during training. Experiments show that our SHADE yields significant improvement and outperforms state-of-the-art methods by 5.05% and 8.35% on the average mIoU of three real-world datasets on single- and multi-source settings, respectively.
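The abstract describes style hallucination and Style Consistency only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes that a "style" is the per-channel mean and standard deviation of a feature map, that a hallucinated style is a random convex combination of basis styles (weights drawn from a symmetric Dirichlet), that the new style is applied by AdaIN-style re-normalization, and that consistency is measured with the Jensen-Shannon divergence between class predictions. None of these choices is stated in the text above, and all function names are hypothetical. The selection of basis styles from the source distribution is not shown; the bases are simply passed in.

```python
import math
import random


def channel_stats(feature, eps=1e-5):
    """Return (mean, std) of one channel given as a flat list of activations."""
    mean = sum(feature) / len(feature)
    var = sum((x - mean) ** 2 for x in feature) / len(feature)
    return mean, math.sqrt(var + eps)


def hallucinate_style(basis_styles, rng):
    """Sample a new (mean, std) as a random convex combination of basis styles.

    Assumption: weights follow a symmetric Dirichlet, obtained here by
    normalising independent Gamma(1, 1) draws.
    """
    raw = [rng.gammavariate(1.0, 1.0) for _ in basis_styles]
    total = sum(raw)
    weights = [r / total for r in raw]
    mean = sum(w * m for w, (m, _) in zip(weights, basis_styles))
    std = sum(w * s for w, (_, s) in zip(weights, basis_styles))
    return mean, std


def apply_style(feature, new_style):
    """Replace the channel statistics of `feature` with `new_style` (AdaIN-style)."""
    mean, std = channel_stats(feature)
    new_mean, new_std = new_style
    return [new_std * (x - mean) / std + new_mean for x in feature]


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


if __name__ == "__main__":
    rng = random.Random(0)
    basis = [(0.0, 1.0), (2.0, 3.0)]      # two hypothetical basis styles
    channel = [1.0, 2.0, 3.0, 4.0]        # one channel of a source feature map
    stylized = apply_style(channel, hallucinate_style(basis, rng))
    # A style consistency term would penalise disagreement between the
    # predictions for the original and the stylized sample at each pixel.
    loss = js_divergence(softmax([2.0, 0.5, 0.1]), softmax([1.5, 0.9, 0.2]))
    print(stylized, loss)
```

In this sketch the consistency term is zero when the two predictions agree and is bounded above by log 2, which keeps it on a comparable scale to a cross-entropy segmentation loss. Retrospection Consistency is not covered here.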
