CV · AI · LG · Mar 28, 2022

Multi-Task Learning for Visual Scene Understanding

arXiv:2203.14896v1 · 9 citations · h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses the need for multi-modal approaches in real-world computer vision problems, though it appears to be an incremental advance over existing MTL frameworks.

The paper tackles the problem of isolated task learning in computer vision by proposing multi-task learning methods that leverage information across tasks to improve generalization, achieving state-of-the-art results on various benchmarks.

Despite the recent progress in deep learning, most approaches still go for a silo-like solution, focusing on learning each task in isolation: training a separate neural network for each individual task. Many real-world problems, however, call for a multi-modal approach and, therefore, for multi-tasking models. Multi-task learning (MTL) aims to leverage useful information across tasks to improve the generalization capability of a model. This thesis is concerned with multi-task learning in the context of computer vision. First, we review existing approaches for MTL. Next, we propose several methods that tackle important aspects of multi-task learning. The proposed methods are evaluated on various benchmarks. The results show several advances in the state of the art in multi-task learning. Finally, we discuss several possibilities for future work.
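
To make the contrast between one network per task and a multi-tasking model concrete, here is a minimal sketch of one common MTL setup: a shared encoder feeding one small head per task, trained on a weighted sum of per-task losses. This is not the method proposed in the thesis. The class name `SharedEncoderMTL`, the task names `depth` and `segmentation`, the layer sizes, and the loss weights are all assumptions chosen for illustration.

```python
import random


def linear(weights, x):
    """Apply a dense layer: one output per weight row (no bias, for brevity)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def init(rows, cols, rng):
    """Small random weight matrix with `rows` outputs and `cols` inputs."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class SharedEncoderMTL:
    """Hypothetical multi-task model: one shared encoder, one head per task."""

    def __init__(self, in_dim, hidden_dim, task_out_dims, seed=0):
        rng = random.Random(seed)
        self.encoder = init(hidden_dim, in_dim, rng)
        self.heads = {
            name: init(out_dim, hidden_dim, rng)
            for name, out_dim in task_out_dims.items()
        }

    def forward(self, x):
        # Shared features are computed once and reused by every task head.
        features = relu(linear(self.encoder, x))
        return {name: linear(head, features) for name, head in self.heads.items()}


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_task_loss(outputs, targets, task_weights):
    """Weighted sum of per-task losses (weights are illustrative assumptions)."""
    return sum(task_weights[n] * mse(outputs[n], targets[n]) for n in outputs)


# Illustrative tasks and sizes; none of these values come from the thesis.
model = SharedEncoderMTL(
    in_dim=4, hidden_dim=8, task_out_dims={"depth": 1, "segmentation": 3}
)
outputs = model.forward([0.5, -1.0, 0.25, 2.0])
targets = {"depth": [1.0], "segmentation": [0.0, 1.0, 0.0]}
loss = multi_task_loss(outputs, targets, {"depth": 1.0, "segmentation": 0.5})
```

In this sketch the encoder weights are the only parameters shared between tasks. A silo-like baseline would instead build a separate encoder for each task, so nothing learned for one task could benefit another.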

