CV | LG | RO | Oct 13, 2022

Self-Supervised Geometric Correspondence for Category-Level 6D Object Pose Estimation in the Wild

arXiv:2210.07199v3 | 31 citations | h-index: 81
Originality: Highly original
AI Analysis

The paper addresses the challenge of generalizing 6D pose estimation to unseen instances in real-world scenarios, which matters for applications in computer vision and robotics. It proposes a novel method for a known bottleneck: the lack of pose annotations.

The paper tackles the problem of category-level 6D object pose estimation in the wild by introducing a self-supervised learning approach that reconstructs canonical 3D shapes and learns dense correspondences via surface embedding, achieving on-par or better performance than supervised methods without human annotations or simulators.

While 6D object pose estimation has wide applications across computer vision and robotics, it remains far from being solved due to the lack of annotations. The problem becomes even more challenging when moving to category-level 6D pose, which requires generalization to unseen instances. Current approaches are restricted by leveraging annotations from simulation or collected from humans. In this paper, we overcome this barrier by introducing a self-supervised learning approach trained directly on large-scale real-world object videos for category-level 6D pose estimation in the wild. Our framework reconstructs the canonical 3D shape of an object category and learns dense correspondences between input images and the canonical shape via surface embedding. For training, we propose novel geometrical cycle-consistency losses which construct cycles across 2D-3D spaces, across different instances and different time steps. The learned correspondence can be applied for 6D pose estimation and other downstream tasks such as keypoint transfer. Surprisingly, our method, without any human annotations or simulators, can achieve on-par or even better performance than previous supervised or semi-supervised methods on in-the-wild images. Our project page is: https://kywind.github.io/self-pose .
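The idea of a cycle across 2D-3D spaces can be illustrated with a minimal NumPy sketch. This is not the authors' loss: it assumes per-pixel embeddings and per-vertex embeddings of the canonical shape are already available, matches them with a softmax over dot-product similarity, and penalizes pixels that do not return to their own image location after a pixel -> vertex -> pixel round trip. The names `cycle_loss`, `pix_emb`, `vert_emb`, `pix_xy`, and the temperature `tau` are all hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cycle_loss(pix_emb, vert_emb, pix_xy, tau=0.1):
    """Illustrative 2D -> 3D -> 2D cycle-consistency loss (an assumption,
    not the paper's exact formulation).

    pix_emb:  (N, D) embeddings of N image pixels
    vert_emb: (M, D) embeddings of M canonical-shape vertices
    pix_xy:   (N, 2) image coordinates of the same N pixels
    """
    sim = pix_emb @ vert_emb.T / tau       # (N, M) similarity scores
    pix_to_vert = softmax(sim, axis=1)     # each pixel's soft match over vertices
    vert_to_pix = softmax(sim.T, axis=1)   # each vertex's soft match over pixels
    round_trip = pix_to_vert @ vert_to_pix # (N, N), every row sums to 1
    landed_xy = round_trip @ pix_xy        # expected 2D location after the cycle
    # Mean squared distance between where a pixel started and where it landed.
    return float(np.mean(np.sum((landed_xy - pix_xy) ** 2, axis=1)))


if __name__ == "__main__":
    corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    matched = np.eye(4)          # each pixel matches exactly one vertex
    uninformative = np.zeros((4, 4))
    print(cycle_loss(matched, matched, corners, tau=0.01))
    print(cycle_loss(uninformative, uninformative, corners))
```

With perfectly matched one-hot embeddings the round trip is essentially the identity and the loss is close to zero. With uninformative all-zero embeddings every pixel lands on the centroid of the four unit-square corners, giving a loss of 0.5.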

Code Implementations: 1 repo
