Context-Aware Domain Adaptation in Semantic Segmentation
The work addresses the problem of adapting semantic segmentation models to new domains without labeled data, which is crucial for applications such as autonomous driving, although the contribution is incremental in that it builds on existing adversarial methods.
The paper tackles unsupervised domain adaptation for semantic segmentation by proposing a cross-attention mechanism to transfer contextual information across domains, achieving state-of-the-art results on the GTA5 to Cityscapes and SYNTHIA to Cityscapes benchmarks.
In this paper, we consider the problem of unsupervised domain adaptation in semantic segmentation. There are two primary issues in this field: what domain knowledge to transfer across two domains, and how to transfer it. Existing methods mainly focus on adapting domain-invariant features (what to transfer) through adversarial learning (how to transfer). Context dependency is essential for semantic segmentation; however, its transferability is still not well understood. Furthermore, how to transfer contextual information across two domains remains unexplored. Motivated by this, we propose a cross-attention mechanism based on self-attention to capture context dependencies between two domains and adapt transferable context. To achieve this goal, we design two cross-domain attention modules to adapt context dependencies from both spatial and channel views. Specifically, the spatial attention module captures local feature dependencies between each pair of positions in the source and target images. The channel attention module models semantic dependencies between each pair of cross-domain channel maps. To adapt context dependencies, we further selectively aggregate the context information from the two domains. The superiority of our method over existing state-of-the-art methods is demonstrated empirically on "GTA5 to Cityscapes" and "SYNTHIA to Cityscapes".
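To make the two attention views concrete, the following is a minimal NumPy sketch of cross-domain spatial and channel attention. It is an illustration under stated assumptions, not the authors' implementation: the function names, the omission of learned query/key/value projections, and the fixed weights `alpha` and `beta` (standing in for the selective aggregation step) are all hypothetical choices made here for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_spatial_attention(query_feat, context_feat):
    """Spatial view: every position of query_feat attends to every
    position of context_feat. Both inputs have shape (C, H, W).
    Assumption: raw features serve as queries, keys, and values."""
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, -1).T      # (Nq, C), one row per query position
    k = context_feat.reshape(c, -1)      # (C, Nk), one column per context position
    attn = softmax(q @ k, axis=-1)       # (Nq, Nk), each row sums to 1
    out = attn @ k.T                     # (Nq, C), weighted sum of context features
    return out.T.reshape(c, h, w), attn


def cross_channel_attention(query_feat, context_feat):
    """Channel view: every channel map of query_feat attends to every
    channel map of context_feat. Both inputs have shape (C, H, W)."""
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, -1)        # (C, N)
    k = context_feat.reshape(c, -1)      # (C, N)
    attn = softmax(q @ k.T, axis=-1)     # (C, C), each row sums to 1
    out = attn @ k                       # (C, N), mix of context channel maps
    return out.reshape(c, h, w), attn


def adapt_context(source_feat, target_feat, alpha=0.5, beta=0.5):
    """Hypothetical aggregation: add source-derived context to the target
    features. alpha and beta are fixed illustrative weights, not the
    selective aggregation described in the abstract."""
    spatial_ctx, _ = cross_spatial_attention(target_feat, source_feat)
    channel_ctx, _ = cross_channel_attention(target_feat, source_feat)
    return target_feat + alpha * spatial_ctx + beta * channel_ctx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.standard_normal((8, 4, 6))  # stand-in for source-domain features
    target = rng.standard_normal((8, 4, 6))  # stand-in for target-domain features
    adapted = adapt_context(source, target)
    print(adapted.shape)
```

In this sketch the target features act as queries and the source features supply the context; swapping the arguments gives the opposite direction. A trainable version would typically add learned projections and learn the aggregation weights rather than fixing them.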