CV · RO · Mar 16, 2022

Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects

arXiv:2203.08472v2 · 28 citations · h-index: 67
AI Analysis

The paper addresses 3D orientation estimation for unseen objects, which matters for robotics and AR applications. The contribution is incremental, since it builds on existing retrieval-based methods.

The paper estimates the 3D orientation of unseen objects from monocular images with a retrieval-based strategy: it computes multi-scale local similarities between the query and synthetic reference images, fuses them adaptively into a global score, and reports significantly better generalization to unseen objects than previous works on LineMOD, LineMOD-Occluded, and T-LESS.

In this paper, we tackle the task of estimating the 3D orientation of previously-unseen objects from monocular images. This task contrasts with the one considered by most existing deep learning methods which typically assume that the testing objects have been observed during training. To handle the unseen objects, we follow a retrieval-based strategy and prevent the network from learning object-specific features by computing multi-scale local similarities between the query image and synthetically-generated reference images. We then introduce an adaptive fusion module that robustly aggregates the local similarities into a global similarity score of pairwise images. Furthermore, we speed up the retrieval process by developing a fast retrieval strategy. Our experiments on the LineMOD, LineMOD-Occluded, and T-LESS datasets show that our method yields a significantly better generalization to unseen objects than previous works. Our code and pre-trained models are available at https://sailor-z.github.io/projects/Unseen_Object_Pose.html.
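
The retrieval idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the feature vectors are assumed to be given (in the paper they would come from a network), the fixed softmax weighting in `fuse` is a hypothetical stand-in for the learned adaptive fusion module, and the fast retrieval strategy is not modeled. All function names and the `temperature` parameter are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def local_similarities(query_feats, ref_feats):
    """Compare features location by location instead of as one global descriptor."""
    return [cosine(q, r) for q, r in zip(query_feats, ref_feats)]


def fuse(local_sims, temperature=0.1):
    """Softmax-weighted average of local similarities.

    Assumption: a fixed weighting standing in for a learned fusion module.
    """
    if not local_sims:
        raise ValueError("need at least one local similarity")
    m = max(local_sims)  # subtract the max for numerical stability
    weights = [math.exp((s - m) / temperature) for s in local_sims]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, local_sims)) / total


def retrieve(query_scales, references):
    """Return the (label, score) of the best-matching reference.

    query_scales: one list of per-location feature vectors per scale.
    references: list of (orientation_label, ref_scales) pairs, same layout.
    """
    best_label, best_score = None, -math.inf
    for label, ref_scales in references:
        sims = []
        for q_feats, r_feats in zip(query_scales, ref_scales):
            sims.extend(local_similarities(q_feats, r_feats))
        score = fuse(sims)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Toy example: two scales, with two and one local features respectively.
query = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
references = [
    ("R_a", [[[0.0, 1.0], [1.0, 0.0]], [[1.0, -1.0]]]),
    ("R_b", [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]),
]
label, score = retrieve(query, references)
```

In this toy setup the orientation attached to the retrieved reference (`label`) would serve as the orientation estimate for the query. Because only local comparisons are scored, nothing in the sketch depends on object identity, which is the property the abstract relies on for unseen objects.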

