CV · RO · Dec 19, 2019

P$^2$GNet: Pose-Guided Point Cloud Generating Networks for 6-DoF Object Pose Estimation

arXiv:1912.09316v2 · 4 citations
Originality: Highly original
AI Analysis

This paper addresses accurate and robust pose estimation in cluttered scenes for robotics and computer vision applications. It presents a novel method rather than an incremental improvement.

The paper tackles 6-DoF object pose estimation under occlusion by proposing P$^2$GNet, which leverages object model priors through an end-to-end estimation-by-generation workflow and achieves state-of-the-art performance with real-time inference on the YCB-Video and LineMOD benchmarks.

Humans can perform fast and accurate object pose estimation even under severe occlusion by exploiting object model priors learned from everyday life. However, most recently proposed pose estimation algorithms neglect the information in object models, often end up with limited accuracy, and tend to fall short in cluttered scenes. In this paper, we present a novel learning-based model, Pose-Guided Point Cloud Generating Networks for 6D Object Pose Estimation (P$^2$GNet), designed to effectively exploit object model priors to facilitate 6D object pose estimation. We achieve this with an end-to-end estimation-by-generation workflow that combines the appearance information from the RGB-D image with the structural knowledge from the object point cloud to enable accurate and robust pose estimation. Experiments on two commonly used benchmarks for 6D pose estimation, the YCB-Video and LineMOD datasets, demonstrate that P$^2$GNet outperforms the state-of-the-art method by a large margin and shows marked robustness to heavy occlusion while achieving real-time inference.

