CV | Jan 22, 2025

Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks

arXiv:2501.12824v1 | 2 citations | h-index: 32 | Has Code | WACV
Originality: Incremental advance
AI Analysis

This work addresses data scarcity in monocular depth estimation for computer vision applications, offering an incremental improvement through multi-task learning.

The paper tackles the challenge of limited labeled data in monocular depth estimation by training on auxiliary datasets from related vision tasks in an alternating scheme. It reports an average improvement of ~11% in depth estimation quality, and shows that the considered depth datasets can be reduced in size by at least 80% without losing quality.
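To make the alternating scheme concrete, here is a minimal sketch under stated assumptions. The task names, the weight values, and the choice to apply the MDE weight as a loss multiplier are all hypothetical; the summary does not say how the higher MDE weight is applied, and the authors' code may do it differently.

```python
from itertools import cycle, islice

# Hypothetical per-task loss weights: MDE counts more than the auxiliary task.
LOSS_WEIGHTS = {"mde": 2.0, "semseg": 1.0}


def alternating_steps(tasks, num_steps):
    """Return the task trained at each step, cycling through `tasks` in order."""
    return list(islice(cycle(tasks), num_steps))


def weighted_loss(task, raw_loss, weights=LOSS_WEIGHTS):
    """Scale a task's raw loss by its (assumed) task weight."""
    return weights[task] * raw_loss


# Toy loop: each step would draw one batch from the chosen task's dataset,
# run it through the shared model, and backpropagate the weighted loss.
schedule = alternating_steps(["mde", "semseg"], num_steps=6)
for step, task in enumerate(schedule):
    raw_loss = 0.5  # placeholder for the loss computed on this task's batch
    loss = weighted_loss(task, raw_loss)
    print(step, task, loss)
```

An alternative design would keep all loss weights at 1 and instead sample MDE batches more often; either option gives the main task more influence on the shared parameters.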

Monocular depth estimation (MDE) is a challenging task in computer vision, often hindered by the cost and scarcity of high-quality labeled datasets. We tackle this challenge using auxiliary datasets from related vision tasks for an alternating training scheme with a shared decoder built on top of a pre-trained vision foundation model, while giving a higher weight to MDE. Through extensive experiments we demonstrate the benefits of incorporating various in-domain auxiliary datasets and tasks to improve MDE quality on average by ~11%. Our experimental analysis shows that auxiliary tasks have different impacts, confirming the importance of task selection, highlighting that quality gains are not achieved by merely adding data. Remarkably, our study reveals that using semantic segmentation datasets as Multi-Label Dense Classification (MLDC) often results in additional quality gains. Lastly, our method significantly improves the data efficiency for the considered MDE datasets, enhancing their quality while reducing their size by at least 80%. This paves the way for using auxiliary data from related tasks to improve MDE quality despite limited availability of high-quality labeled data. Code is available at https://jugit.fz-juelich.de/ias-8/mdeaux.
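One plausible reading of treating semantic segmentation as Multi-Label Dense Classification is that every class becomes an independent per-pixel binary target, scored with a sigmoid-based binary cross-entropy instead of a softmax over classes. The sketch below illustrates that reading only; the function names are hypothetical, and the abstract does not confirm that this is how the authors build their MLDC targets or loss.

```python
import math


def to_multilabel_masks(label_map, num_classes):
    """Turn a 2D map of class ids into one 0/1 mask per class."""
    return [
        [[1 if px == c else 0 for px in row] for row in label_map]
        for c in range(num_classes)
    ]


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy on a raw logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mldc_loss(logits, label_map, num_classes):
    """Mean per-pixel, per-class binary cross-entropy.

    `logits` is indexed as logits[class][row][col].
    """
    masks = to_multilabel_masks(label_map, num_classes)
    total, count = 0.0, 0
    for c in range(num_classes):
        for y, row in enumerate(masks[c]):
            for x, target in enumerate(row):
                total += bce_with_logits(logits[c][y][x], target)
                count += 1
    return total / count


# Toy 2x2 label map with 3 classes and all-zero logits.
label_map = [[0, 1], [1, 2]]
zero_logits = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
print(mldc_loss(zero_logits, label_map, 3))
```

Because each class is scored independently, a pixel may in principle carry several positive labels, which is what distinguishes this formulation from standard single-label segmentation.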

Code Implementations: 1 repo
