CV | Jun 8, 2023

Background Prompting for Improved Object Depth

arXiv:2306.05428v1 | 2 citations | h-index: 140
Originality: Incremental advance
AI Analysis

This work targets a domain-specific problem relevant to vision, robotics, and graphics applications: improving sim2real generalization in object depth estimation. The contribution is incremental because it builds on existing depth networks rather than proposing a new one.

The paper tackles inaccurate object depth estimation from single images in diverse scenes. It proposes Background Prompting, which places the segmented input object onto a learned background before running a depth network, and reports consistent improvements across multiple datasets.

Estimating the depth of objects from a single image is a valuable task for many vision, robotics, and graphics applications. However, current methods often fail to produce accurate depth for objects in diverse scenes. In this work, we propose a simple yet effective Background Prompting strategy that adapts the input object image with a learned background. We learn the background prompts only using small-scale synthetic object datasets. To infer object depth on a real image, we place the segmented object into the learned background prompt and run off-the-shelf depth networks. Background Prompting helps the depth networks focus on the foreground object, as they are made invariant to background variations. Moreover, Background Prompting minimizes the domain gap between synthetic and real object images, leading to better sim2real generalization than simple finetuning. Results on multiple synthetic and real datasets demonstrate consistent improvements in real object depths for a variety of existing depth networks. Code and optimized background prompts can be found at: https://mbaradad.github.io/depth_prompt.
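
The inference step described in the abstract (place the segmented object into the learned background, then run an off-the-shelf depth network) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the array layout, and the `depth_net` callable are all hypothetical, and restricting the output to object pixels is an assumption about what "object depth" means here. How the background prompt itself is learned is not shown.

```python
import numpy as np


def composite_on_background(image, mask, background):
    """Paste a segmented object onto a background prompt.

    image, background: float arrays of shape (H, W, 3) with values in [0, 1].
    mask: array of shape (H, W), 1 on the object and 0 elsewhere
          (fractional values act as soft alpha).
    All names and shapes here are illustrative assumptions.
    """
    if image.shape != background.shape:
        raise ValueError("image and background must have the same shape")
    if mask.shape != image.shape[:2]:
        raise ValueError("mask must match the image height and width")
    alpha = mask.astype(image.dtype)[..., None]  # (H, W, 1) for broadcasting
    return alpha * image + (1.0 - alpha) * background


def predict_object_depth(image, mask, background, depth_net):
    """Run a depth network on the prompted image and keep object pixels only.

    depth_net: hypothetical callable mapping an (H, W, 3) image to an
               (H, W) depth map; any off-the-shelf network could be wrapped
               to fit this signature.
    """
    prompted = composite_on_background(image, mask, background)
    depth = depth_net(prompted)
    # Assumption: only the foreground depth is of interest, so background
    # pixels are set to NaN.
    return np.where(mask > 0.5, depth, np.nan)


if __name__ == "__main__":
    h, w = 4, 4
    image = np.random.rand(h, w, 3)
    background = np.random.rand(h, w, 3)  # stand-in for a learned prompt
    mask = np.zeros((h, w))
    mask[1:3, 1:3] = 1.0

    # Stand-in "network" so the sketch runs without a real depth model.
    fake_depth_net = lambda img: img.mean(axis=-1)
    print(predict_object_depth(image, mask, background, fake_depth_net))
```

Because the depth network is passed in as a plain callable, the same compositing step can be reused with different networks, which matches the abstract's claim that the prompts work with a variety of existing depth networks.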

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
