CV · Nov 20, 2018

Double Refinement Network for Efficient Indoor Monocular Depth Estimation

arXiv:1811.08466v2 · 12 citations
Originality: Highly original
AI Analysis

This work addresses efficiency bottlenecks in indoor monocular depth estimation for computer vision applications, offering a significant speedup without accuracy loss.

State-of-the-art monocular depth estimation methods use large amounts of memory and time per image. The paper introduces the Double Refinement Network to address this. It reports state-of-the-art accuracy on the NYU Depth v2 dataset, with up to 18 times speedup per image at batch size 1 and lower RAM usage per image.

Monocular depth estimation is the task of obtaining a measure of distance for each pixel using a single image. It is an important problem in computer vision and is usually solved using neural networks. Though recent works in this area have shown significant improvement in accuracy, the state-of-the-art methods tend to require massive amounts of memory and time to process an image. The main purpose of this work is to improve the performance of the latest solutions with no decrease in accuracy. To this end, we introduce the Double Refinement Network architecture. The proposed method achieves state-of-the-art results on the standard benchmark RGB-D dataset NYU Depth v2, while its frames per second rate is significantly higher (up to 18 times speedup per image at batch size 1) and the RAM usage per image is lower.
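The abstract reports speed as a frames-per-second rate at batch size 1 and as a speedup ratio. As a rough illustration of how such figures can be computed, here is a minimal sketch. It is not the authors' benchmarking code: the function names `frames_per_second` and `speedup`, the `warmup` parameter, and the idea of passing the model as a plain callable are all assumptions made for this example.

```python
import time
from typing import Callable, Sequence


def frames_per_second(model: Callable[[object], object],
                      images: Sequence[object],
                      warmup: int = 2) -> float:
    """Average number of images processed per second at batch size 1.

    `model` is any callable that maps one image to one depth map
    (a hypothetical stand-in for a real network's forward pass).
    """
    if not images:
        raise ValueError("need at least one image to time")

    # Untimed warm-up calls, so one-off setup costs do not skew the result.
    for image in images[:warmup]:
        model(image)

    # Timed loop: one image per call, i.e. batch size 1.
    start = time.perf_counter()
    for image in images:
        model(image)
    elapsed = time.perf_counter() - start

    return len(images) / elapsed


def speedup(fps_new: float, fps_baseline: float) -> float:
    """How many times faster the new method is than the baseline."""
    return fps_new / fps_baseline
```

With this definition, a method running at 36 frames per second against a baseline at 2 frames per second gives `speedup(36.0, 2.0) == 18.0`. When timing a model that runs asynchronously on a GPU, the device would also need to be synchronized before each clock reading, which this sketch leaves out.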

