CV · Nov 28, 2020

AdaBins: Depth Estimation using Adaptive Bins

arXiv:2011.14141v1 · 1159 citations
AI Analysis

This work provides a significant improvement in single-image depth estimation for computer vision applications, potentially benefiting areas like robotics and 3D reconstruction.

This paper addresses the problem of estimating a high-quality dense depth map from a single RGB image. The authors propose AdaBins, a transformer-based architecture block that divides the depth range into bins, estimates the bin centers adaptively per image, and predicts each final depth value as a linear combination of those centers. They report a decisive improvement over the state of the art on several popular depth datasets across all metrics.

We address the problem of estimating a high quality dense depth map from a single RGB input image. We start out with a baseline encoder-decoder convolutional neural network architecture and pose the question of how the global processing of information can help improve overall depth estimation. To this end, we propose a transformer-based architecture block that divides the depth range into bins whose center value is estimated adaptively per image. The final depth values are estimated as linear combinations of the bin centers. We call our new building block AdaBins. Our results show a decisive improvement over the state-of-the-art on several popular depth datasets across all metrics. We also validate the effectiveness of the proposed block with an ablation study and provide the code and corresponding pre-trained weights of the new state-of-the-art model.
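The bin-based prediction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the normalization of bin widths over a fixed depth range `[d_min, d_max]`, and the use of softmax scores as combination weights are assumptions made for illustration. The abstract itself states only that bin centers are estimated per image and that depth is a linear combination of them.

```python
import math


def bin_centers(widths, d_min, d_max):
    """Turn positive per-image bin widths into bin centers covering [d_min, d_max].

    Assumption: widths are normalized so the bins exactly tile the depth range.
    """
    total = sum(widths)
    centers = []
    left = d_min  # left edge of the current bin
    for w in widths:
        span = (d_max - d_min) * w / total  # width of this bin in depth units
        centers.append(left + span / 2)
        left += span
    return centers


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def pixel_depth(scores, centers):
    """Depth of one pixel as a linear combination of the bin centers.

    Assumption: the combination weights are softmax probabilities over bins.
    """
    weights = softmax(scores)
    return sum(w * c for w, c in zip(weights, centers))


# Hypothetical example: three bins over a 0 to 8 range, one pixel with equal scores.
centers = bin_centers([1.0, 1.0, 2.0], d_min=0.0, d_max=8.0)
depth = pixel_depth([0.0, 0.0, 0.0], centers)
```

Because the output is a weighted average of the centers rather than the center of the single most likely bin, the predicted depth varies smoothly between bins instead of being restricted to a fixed set of discrete values.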

Code Implementations · 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

