CV · RO · Mar 14, 2017

6-DoF Object Pose from Semantic Keypoints

arXiv:1703.04670v1 · 422 citations
Originality: Incremental advance
AI Analysis

This work addresses pose estimation for both textured and textureless objects in cluttered backgrounds, with applications in robotics and computer vision. It builds incrementally on existing keypoint and shape-model methods.

The paper tackles the problem of estimating 6-DoF object pose from a single RGB image by combining semantic keypoints with a deformable shape model, achieving state-of-the-art accuracy on the PASCAL3D+ dataset for class-based scenarios.

This paper presents a novel approach to estimating the continuous six degree of freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image. The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model. Unlike prior work, we are agnostic to whether the object is textured or textureless, as the convnet learns the optimal representation from the available training image data. Furthermore, the approach can be applied to instance- and class-based pose recovery. Empirically, we show that the proposed approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios with a cluttered background. For class-based object pose estimation, state-of-the-art accuracy is shown on the large-scale PASCAL3D+ dataset.
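
To make the idea of pairing keypoints with a deformable shape model concrete, here is a minimal sketch of the forward model such a method might optimize over. The abstract does not give the formulation, so everything here is an assumption for illustration: the linear shape basis, the pinhole camera, the confidence-weighted error, and all function and parameter names (`deform`, `project`, `reprojection_error`, `focal`) are hypothetical and not taken from the paper or its code.

```python
import math

def rotation_z(theta):
    """3x3 rotation about the camera z-axis (one simple way to build a rotation R)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def deform(mean_shape, bases, coeffs):
    """Assumed linear deformable model: mean keypoints plus weighted basis offsets."""
    shape = [list(p) for p in mean_shape]
    for basis, c in zip(bases, coeffs):
        for p, b in zip(shape, basis):
            for k in range(3):
                p[k] += c * b[k]
    return shape

def project(shape, R, t, focal):
    """Apply the 6-DoF pose (R, t), then an assumed pinhole projection."""
    pixels = []
    for p in shape:
        x, y, z = (sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        pixels.append((focal * x / z, focal * y / z))
    return pixels

def reprojection_error(detected, confidences, projected):
    """Confidence-weighted squared distance between detected and projected keypoints."""
    return sum(
        w * ((u - pu) ** 2 + (v - pv) ** 2)
        for (u, v), w, (pu, pv) in zip(detected, confidences, projected)
    )

# Toy object with three keypoints and one basis that stretches it along x.
mean_shape = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
bases = [[(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
shape = deform(mean_shape, bases, coeffs=[0.5])
projected = project(shape, rotation_z(0.0), t=(0.0, 0.0, 5.0), focal=100.0)
error = reprojection_error(projected, [1.0, 1.0, 1.0], projected)
```

In a sketch like this, pose recovery would mean searching for the rotation, translation, and shape coefficients that minimize `reprojection_error` against the keypoints the convnet detects. The optimizer itself is left out here.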

Code Implementations: 1 repo
