CV · RO · IV · Apr 27, 2020

Single Shot 6D Object Pose Estimation

arXiv:2004.12729v1 · 46 citations
Originality: Incremental advance
AI Analysis

The method provides a fast and accurate solution for robotics and augmented reality applications, though the advance is incremental, as it builds on existing single-shot and regression-based approaches.

The paper tackles 6D object pose estimation from depth images by introducing a fast, single-shot method that regresses poses locally on volume elements, achieving 65 fps on a GPU and outperforming state-of-the-art methods on public benchmarks without needing real-world annotated data.

In this paper, we introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images. For this purpose, a fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task that is solved locally on the resulting volume elements. At 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously. Our approach does not require manually 6D pose-annotated real-world datasets and transfers to the real world despite being trained entirely on synthetic data. The proposed method is evaluated on public benchmark datasets, where we demonstrate that it significantly outperforms state-of-the-art methods.
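
To make the idea of solving regression "locally on the resulting volume elements" more concrete, here is a minimal sketch of how object positions might be assigned to volume elements when building training targets. It is an illustration under stated assumptions, not the paper's implementation: the `Grid` class, the `build_targets` helper, the grid bounds and resolution, and the target layout (one normalized position offset per occupied element) are all hypothetical, and the rotation part of the 6D pose is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Index3 = Tuple[int, int, int]


@dataclass(frozen=True)
class Grid:
    """Axis-aligned volume split into equally sized volume elements.

    Hypothetical layout for illustration; the paper's actual
    discretization may differ.
    """
    lower: Vec3    # minimum corner of the volume
    upper: Vec3    # maximum corner of the volume
    cells: Index3  # number of volume elements along x, y, z

    def locate(self, p: Vec3) -> Optional[Tuple[Index3, Vec3]]:
        """Return the element containing p and p's offset inside it.

        The offset is normalized to [0, 1) per axis. Points outside
        the volume yield None.
        """
        index: List[int] = []
        offset: List[float] = []
        for lo, hi, n, v in zip(self.lower, self.upper, self.cells, p):
            if not lo <= v < hi:
                return None
            t = (v - lo) / (hi - lo) * n  # position in units of elements
            i = min(int(t), n - 1)        # guard against rounding up to n
            index.append(i)
            offset.append(t - i)
        return (index[0], index[1], index[2]), (offset[0], offset[1], offset[2])


def build_targets(grid: Grid, positions: List[Vec3]) -> Dict[Index3, Vec3]:
    """Map each occupied volume element to a local position target.

    Assumption: at most one object per element; if several objects
    fall into the same element, the first one is kept.
    """
    targets: Dict[Index3, Vec3] = {}
    for p in positions:
        located = grid.locate(p)
        if located is None:
            continue  # object origin lies outside the discretized volume
        index, offset = located
        targets.setdefault(index, offset)
    return targets


# Usage: a 4 x 4 x 4 grid over a cube with side length 4.
grid = Grid(lower=(0.0, 0.0, 0.0), upper=(4.0, 4.0, 4.0), cells=(4, 4, 4))
targets = build_targets(grid, [(1.25, 2.5, 3.75), (5.0, 0.0, 0.0)])
```

In a setup like this, a fully convolutional network would output one prediction per volume element, so all objects in the scene can be handled in a single forward pass. The per-element rotation target and the loss are omitted here because the abstract does not specify them.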

