CV · Jul 20, 2016

Deep Learning of Local RGB-D Patches for 3D Object Detection and 6D Pose Estimation

arXiv:1607.06038v1 · 291 citations
Originality: Incremental advance
AI Analysis

This paper addresses accurate 3D object detection and pose estimation for robotics and augmented reality applications, and presents an incremental improvement over existing methods.

The paper tackles 3D object detection and 6D pose estimation by using a convolutional auto-encoder to regress descriptors from local RGB-D patches. These descriptors are matched against patches from synthetic model views, and each match casts a vote for the object's pose. The paper reports robust detection results on three datasets that compete with and in some cases surpass state-of-the-art methods, while scaling to multiple objects.

We present a 3D object detection method that uses regressed descriptors of locally-sampled RGB-D patches for 6D vote casting. For regression, we employ a convolutional auto-encoder that has been trained on a large collection of random local patches. During testing, scene patch descriptors are matched against a database of synthetic model view patches and cast 6D object votes which are subsequently filtered to refined hypotheses. We evaluate on three datasets to show that our method generalizes well to previously unseen input data and delivers robust detection results that compete with and surpass the state of the art, while being scalable in the number of objects.
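The matching and vote-casting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes descriptors have already been computed (for example by an auto-encoder), uses brute-force nearest-neighbour search, and simplifies the 6D vote to a 3D translation toward the object centre. The function name `cast_votes` and the parameters `k` and `max_dist` are hypothetical and chosen only for illustration.

```python
import numpy as np

def cast_votes(scene_desc, scene_centers, codebook_desc, codebook_offsets,
               k=3, max_dist=1.0):
    """Match scene descriptors to a codebook and cast translation votes.

    scene_desc:       (n_scene, d) descriptors of scene patches
    scene_centers:    (n_scene, 3) 3D centre of each scene patch
    codebook_desc:    (n_code, d) descriptors of synthetic model-view patches
    codebook_offsets: (n_code, 3) stored offset from patch centre to object centre
    Returns an (n_votes, 3) array of voted object centres.
    (Assumption: rotation is omitted here; a full 6D vote would carry it too.)
    """
    # Pairwise squared Euclidean distances, shape (n_scene, n_code).
    diff = scene_desc[:, None, :] - codebook_desc[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)

    # Indices and distances of the k nearest codebook entries per scene patch.
    k = min(k, codebook_desc.shape[0])
    nn = np.argsort(dist2, axis=1)[:, :k]
    nn_dist2 = np.take_along_axis(dist2, nn, axis=1)

    # Discard matches whose descriptor distance exceeds the threshold.
    keep = nn_dist2 <= max_dist ** 2

    # Each surviving match votes: patch centre + stored offset.
    votes = scene_centers[:, None, :] + codebook_offsets[nn]
    return votes[keep]

if __name__ == "__main__":
    codebook_desc = np.array([[0.0, 0.0], [10.0, 10.0]])
    codebook_offsets = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    scene_desc = np.array([[0.1, 0.0], [10.0, 10.1], [5.0, 5.0]])
    scene_centers = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
    print(cast_votes(scene_desc, scene_centers, codebook_desc,
                     codebook_offsets, k=1, max_dist=0.5))
```

In a sketch like this, the resulting votes would still need to be clustered or otherwise filtered to obtain pose hypotheses; that stage is left out here. For a large codebook, an approximate nearest-neighbour index would be a more practical choice than the brute-force distance matrix.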

