CV Jun 21, 2022

Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth

Nitin Bansal, Pan Ji, Junsong Yuan, Yi Xu

arXiv:2206.10562v2 3.75 citations h-index: 41

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving efficiency and accuracy in dense prediction tasks for computer vision applications, though its advance over existing multi-task learning methods is incremental.

The paper tackles the multi-task learning problem of jointly training semantic segmentation and depth estimation. It introduces a Cross-Channel Attention Module (CCAM) for feature sharing between the two tasks and two data augmentations, AffineMix and ColorAug, that each use one task's predictions to augment the other. It reports state-of-the-art results for a semi-supervised joint depth and segmentation model on the Cityscapes and ScanNet datasets, with a negligible increase in trainable parameters.

The multi-task learning (MTL) paradigm focuses on jointly learning two or more tasks, aiming for significant improvement in a model's generalizability, performance, and training/inference memory footprint. These benefits become all the more indispensable when jointly training vision-related dense prediction tasks. In this work, we tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called the Cross-Channel Attention Module (CCAM), which facilitates effective feature sharing along each channel between the two tasks, leading to mutual performance gain with a negligible increase in trainable parameters. In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth, called AffineMix, and a simple depth augmentation using predicted semantics, called ColorAug. Finally, we validate the performance gain of the proposed method on the Cityscapes and ScanNet datasets, achieving state-of-the-art results for a semi-supervised joint model based on depth and semantic segmentation.
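To make the idea of channel-wise feature sharing between two tasks more concrete, here is a minimal sketch in plain Python. It is not the paper's CCAM: the abstract does not specify the module's design, so the function names (`cross_channel_attention`, `share_features`), the dot-product affinity, the softmax weighting, and the residual addition are all assumptions chosen for illustration, and the sketch has no learned parameters at all.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_channel_attention(src, dst):
    """Hypothetical helper: mix channels of `src` for use by the `dst` task.

    src, dst: lists of channels, each channel a flat list of N spatial values.
    For every dst channel, returns a softmax-weighted mixture of src channels,
    where the weights come from a scaled dot-product channel affinity.
    """
    n = len(src[0])
    scale = 1.0 / math.sqrt(n)
    mixed = []
    for d in dst:
        # Affinity between this dst channel and every src channel.
        scores = [scale * sum(a * b for a, b in zip(d, s)) for s in src]
        weights = softmax(scores)
        # Weighted sum of src channels at each spatial position.
        mixed.append([sum(w * s[i] for w, s in zip(weights, src))
                      for i in range(n)])
    return mixed

def share_features(seg, depth):
    """Assumed residual exchange: each task adds attended features from the other."""
    from_depth = cross_channel_attention(depth, seg)
    from_seg = cross_channel_attention(seg, depth)
    seg_out = [[x + y for x, y in zip(c, m)] for c, m in zip(seg, from_depth)]
    depth_out = [[x + y for x, y in zip(c, m)] for c, m in zip(depth, from_seg)]
    return seg_out, depth_out
```

Calling `share_features(seg, depth)` with two feature maps of the same channel count and spatial size returns two maps of the same shape, each enriched with a channel-wise mixture of the other task's features. A trained module would typically add learned projections before computing the affinity; that is omitted here to keep the sketch short.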
