CV | Apr 22, 2022

Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images

arXiv:2204.10776v2 | 203 citations | h-index: 45
Originality: Highly original
AI Analysis

Gen6D addresses a limitation of existing generalizable pose estimators, which rely on high-quality object models or on extra inputs such as depth maps or object masks. Removing that dependence could enable broader applications in robotics and AR/VR.

The paper tackles the problem of 6-DoF object pose estimation from RGB images without requiring 3D models or additional depth/mask inputs, achieving state-of-the-art results on model-free datasets like MOPED and GenMOP, and competitive performance on LINEMOD compared to instance-specific methods.

In this paper, we present a generalizable model-free 6-DoF object pose estimator called Gen6D. Existing generalizable pose estimators either need high-quality object models or require additional depth maps or object masks at test time, which significantly limits their application scope. In contrast, our pose estimator requires only some posed images of the unseen object and is able to accurately predict the poses of the object in arbitrary environments. Gen6D consists of an object detector, a viewpoint selector, and a pose refiner, none of which requires a 3D object model and all of which generalize to unseen objects. Experiments show that Gen6D achieves state-of-the-art results on two model-free datasets: the MOPED dataset and a new GenMOP dataset collected by us. In addition, on the LINEMOD dataset, Gen6D achieves competitive results compared with instance-specific pose estimators. Project page: https://liuyuan-pal.github.io/Gen6D/.
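The coarse-to-fine idea in the abstract (pick the closest posed reference view, then refine) can be illustrated with a toy sketch. This is not the authors' implementation: the `Reference` class, the feature vectors, the single rotation angle standing in for a full 6-DoF pose, and the `refine` callback are all hypothetical simplifications, and the object detection stage is omitted entirely.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Reference:
    """A posed reference image, reduced to a hypothetical descriptor."""
    name: str
    feature: Sequence[float]   # hypothetical image descriptor
    rotation_deg: float        # stand-in for the full reference pose


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def select_viewpoint(query: Sequence[float],
                     references: Sequence[Reference]) -> Reference:
    """Return the reference whose descriptor is most similar to the query."""
    return max(references, key=lambda r: cosine_similarity(query, r.feature))


def estimate_pose(query: Sequence[float],
                  references: Sequence[Reference],
                  refine: Callable[[float], float]) -> float:
    """Coarse pose from the selected reference, then a caller-supplied refinement."""
    coarse = select_viewpoint(query, references).rotation_deg
    return refine(coarse)


refs = [
    Reference("front", [1.0, 0.0], 0.0),
    Reference("side", [0.0, 1.0], 90.0),
]
best = select_viewpoint([0.1, 0.9], refs)
```

Here `best` is the "side" reference, because its descriptor points in nearly the same direction as the query. In a real system the descriptors would come from a learned network and the refinement step would update a full rotation and translation.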

Code Implementations: 1 repo
