CV · Jul 10, 2025

PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency

Haotian Wang, Aoran Xiao, Xiaoqin Zhang, Meng Yang, Shijian Lu

arXiv:2507.07374v110.24 citations · h-index: 26 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reducing annotation effort for depth completion in robotics and autonomous systems. Its data synthesis approach is an incremental advance that builds on existing depth foundation models.

The paper tackles the problem of training generalizable depth completion models without large-scale labeled datasets. It introduces PacGDC, a label-efficient technique that synthesizes diverse pseudo geometries by exploiting projection ambiguity and consistency, and it reports strong performance across multiple benchmarks in zero-shot and few-shot settings.

Generalizable depth completion enables the acquisition of dense metric depth maps for unseen environments, offering robust perception capabilities for various downstream tasks. However, training such models typically requires large-scale datasets with metric depth labels, which are often labor-intensive to collect. This paper presents PacGDC, a label-efficient technique that enhances data diversity with minimal annotation effort for generalizable depth completion. PacGDC builds on novel insights into inherent ambiguities and consistencies in object shapes and positions during 2D-to-3D projection, allowing the synthesis of numerous pseudo geometries for the same visual scene. This process greatly broadens available geometries by manipulating scene scales of the corresponding depth maps. To leverage this property, we propose a new data synthesis pipeline that uses multiple depth foundation models as scale manipulators. These models robustly provide pseudo depth labels with varied scene scales, affecting both local objects and global layouts, while ensuring projection consistency that supports generalization. To further diversify geometries, we incorporate interpolation and relocation strategies, as well as unlabeled images, extending the data coverage beyond the individual use of foundation models. Extensive experiments show that PacGDC achieves remarkable generalizability across multiple benchmarks, excelling in diverse scene semantics/scales and depth sparsity/patterns under both zero-shot and few-shot settings. Code: https://github.com/Wang-xjtu/PacGDC.
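The projection ambiguity and consistency described in the abstract can be illustrated with a small pinhole-camera sketch. This is not the authors' pipeline: the intrinsics `K`, the random depth maps standing in for foundation-model outputs, and the helper names `backproject`, `project`, and `interpolate_depths` are all hypothetical, chosen only to show that rescaled or blended depth maps lift each pixel along its own viewing ray and therefore reproject onto the same image.

```python
import numpy as np

def backproject(depth, K):
    """Lift each pixel (u, v) with depth d to the 3D point d * K^-1 [u, v, 1]^T."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (h, w, 3)
    rays = pix @ np.linalg.inv(K).T  # viewing ray per pixel, z-component equals 1
    return rays * depth[..., None]

def project(points, K):
    """Pinhole projection of 3D points back to pixel coordinates."""
    proj = points @ K.T
    return proj[..., :2] / proj[..., 2:3]

def interpolate_depths(d_a, d_b, alpha):
    """Blend two depth maps per pixel (hypothetical stand-in for an interpolation strategy)."""
    return (1.0 - alpha) * d_a + alpha * d_b

rng = np.random.default_rng(0)
# Assumed intrinsics for a 64x48 image; not taken from the paper.
K = np.array([[500.0, 0.0, 32.0],
              [0.0, 500.0, 24.0],
              [0.0, 0.0, 1.0]])

# Random positive depth maps standing in for two pseudo depth labels
# with different scene scales.
depth_a = rng.uniform(1.0, 5.0, size=(48, 64))
depth_b = rng.uniform(2.0, 20.0, size=(48, 64))

depth_scaled = 2.5 * depth_a                             # global scale change
depth_mix = interpolate_depths(depth_a, depth_b, 0.3)    # blended geometry

pix_a = project(backproject(depth_a, K), K)
pix_scaled = project(backproject(depth_scaled, K), K)
pix_mix = project(backproject(depth_mix, K), K)

# Different 3D geometries, identical 2D projections.
same_pixels = np.allclose(pix_a, pix_scaled) and np.allclose(pix_a, pix_mix)
print("geometries differ:", not np.allclose(depth_a, depth_mix))
print("projections agree:", same_pixels)
```

In this toy setting every positive depth map is consistent with the same image, which is why varying the scene scale can add geometric diversity without requiring new images or new metric labels.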

