CV · AI · RO · Jun 17, 2022

DenseMTL: Cross-task Attention Mechanism for Dense Multi-task Learning

arXiv:2206.08927v2 · 50 citations · h-index: 21 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses scene understanding for computer vision applications, offering incremental improvements in multi-task learning efficiency and accuracy.

The paper tackles dense multi-task learning for scene understanding by jointly addressing semantic segmentation and three geometry-related tasks (depth, surface normal, and edge estimation). It proposes a cross-task attention mechanism that improves performance on both indoor and outdoor datasets.

Multi-task learning has recently emerged as a promising solution for a comprehensive understanding of complex scenes. In addition to being memory-efficient, multi-task models, when appropriately designed, can facilitate the exchange of complementary signals across tasks. In this work, we jointly address 2D semantic segmentation and three geometry-related tasks: dense depth estimation, surface normal estimation, and edge estimation, demonstrating their benefits on both indoor and outdoor datasets. We propose a novel multi-task learning architecture that leverages pairwise cross-task exchange through correlation-guided attention and self-attention to enhance the overall representation learning for all tasks. We conduct extensive experiments across three multi-task setups, showing the advantages of our approach compared to competitive baselines in both synthetic and real-world benchmarks. Additionally, we extend our method to the novel multi-task unsupervised domain adaptation setting. Our code is available at https://github.com/cv-rits/DenseMTL
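The pairwise exchange described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that); it is a simplified illustration under stated assumptions: each task has a feature map of shape (C, H, W), the "correlation-guided attention" is modelled as a softmax over the spatial correlation between two tasks' features, the "self-attention" branch reuses the same operation on a single task, and the two branches are fused by plain channel concatenation. All function names are hypothetical, and learned projections are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def correlation_guided_attention(f_i, f_j):
    """Gather task-j features at positions that correlate with task-i features.

    f_i, f_j: arrays of shape (C, H, W). Returns an array of shape (C, H, W).
    Assumption: no learned query/key/value projections, for brevity.
    """
    c, h, w = f_i.shape
    q = f_i.reshape(c, h * w).T           # (N, C) queries from task i
    k = f_j.reshape(c, h * w).T           # (N, C) keys from task j
    v = k                                 # values are the task-j features
    corr = q @ k.T / np.sqrt(c)           # (N, N) spatial correlation
    attn = softmax(corr, axis=-1)         # each row sums to 1
    out = attn @ v                        # (N, C) re-weighted task-j features
    return out.T.reshape(c, h, w)


def self_attention(f_j):
    """Self-attention branch: task j attends to its own features."""
    return correlation_guided_attention(f_j, f_j)


def cross_task_exchange(f_i, f_j):
    """Enrich task-i features with a message from task j.

    Assumption: fusion is a channel-wise concatenation of the original
    features and both attention branches, giving shape (3C, H, W).
    """
    corr_branch = correlation_guided_attention(f_i, f_j)
    self_branch = self_attention(f_j)
    return np.concatenate([f_i, corr_branch, self_branch], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f_seg = rng.standard_normal((4, 3, 5))    # hypothetical segmentation features
    f_depth = rng.standard_normal((4, 3, 5))  # hypothetical depth features
    fused = cross_task_exchange(f_seg, f_depth)
    print(fused.shape)
```

In a full model, an exchange of this kind would run for every ordered pair of tasks, and a learned layer would reduce the concatenated channels back to C before the task-specific decoder; both steps are left out here.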

Code Implementations: 2 repos
