CV | Oct 13, 2021

Plugging Self-Supervised Monocular Depth into Unsupervised Domain Adaptation for Semantic Segmentation

arXiv:2110.06685v1 | 24 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reducing annotation needs for semantic segmentation in autonomous driving by integrating depth cues, though it is incremental as it builds on existing UDA methods.

The paper tackles the problem of semantic segmentation in autonomous driving by improving unsupervised domain adaptation (UDA) using self-supervised monocular depth estimation, achieving state-of-the-art performance of 58.8 mIoU on the GTA5->CS benchmark.

Although recent semantic segmentation methods have made remarkable progress, they still rely on large amounts of annotated training data, which are often infeasible to collect in the autonomous driving scenario. Previous works usually tackle this issue with Unsupervised Domain Adaptation (UDA), which entails training a network on synthetic images and applying the model to real ones while minimizing the discrepancy between the two domains. Yet, these techniques do not consider additional information that may be obtained from other tasks. In contrast, we propose to exploit self-supervised monocular depth estimation to improve UDA for semantic segmentation. On the one hand, we deploy depth to realize a plug-in component which can inject complementary geometric cues into any existing UDA method. On the other hand, we rely on depth to generate a large and varied set of samples to Self-Train the final model. Our whole proposal allows for achieving state-of-the-art performance (58.8 mIoU) in the GTA5->CS benchmark. Code is available at https://github.com/CVLAB-Unibo/d4-dbst.
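The 58.8 figure above is a mean Intersection-over-Union (mIoU) score. As a rough illustration of how that metric is commonly computed, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the toy labels, and the `ignore_index=255` convention are assumptions made for this example.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count (true class, predicted class) pairs over flattened pixel labels.

    Pixels whose ground-truth label equals ignore_index are skipped
    (255 is an assumed "void" label for this sketch).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six "pixels", three classes, the last pixel is ignored.
y_true = [0, 0, 1, 1, 2, 255]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou = mean_iou(cm)
print(f"mIoU = {100 * miou:.1f}")
```

In the toy example the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18. Benchmark scores such as the one quoted above are this quantity expressed as a percentage, accumulated over a full validation set.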

Code Implementations: 1 repo
