CV · Jul 19, 2022

PoserNet: Refining Relative Camera Poses Exploiting Object Detections

arXiv:2207.09445v2 · 5 citations · h-index: 35 · Has Code
AI Analysis

This work addresses relative camera pose estimation across multiple views, refining approximate pairwise pose estimates with a lightweight Graph Neural Network.

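The paper refines relative camera poses in multi-view image sets by associating objectness regions (bounding boxes) across views, rather than relying on explicit semantic object detections. On the 7-Scenes dataset, feeding the refined poses to optimisation-based Motion Averaging algorithms reduces the median rotation error by 62 degrees relative to the initial estimates obtained from bounding boxes.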

The estimation of the camera poses associated with a set of images commonly relies on feature matches between the images. In contrast, we are the first to address this challenge by using objectness regions to guide the pose estimation problem rather than explicit semantic object detections. We propose Pose Refiner Network (PoserNet), a light-weight Graph Neural Network, to refine the approximate pair-wise relative camera poses. PoserNet exploits associations between the objectness regions - concisely expressed as bounding boxes - across multiple views to globally refine sparsely connected view graphs. We evaluate on the 7-Scenes dataset across varied sizes of graphs and show how this process can be beneficial to optimisation-based Motion Averaging algorithms, improving the median error on the rotation by 62 degrees with respect to the initial estimates obtained based on bounding boxes. Code and data are available at https://github.com/IIT-PAVIS/PoserNet.
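To make the headline metric concrete, the sketch below computes a median rotation error in degrees using the standard geodesic angle between two rotation matrices. It is an illustration only: the function names (`rotation_error_deg`, `median_rotation_error_deg`, `rot_z`) are hypothetical, and the abstract does not specify the exact evaluation protocol, so this should not be read as the authors' implementation.

```python
import math
from statistics import median


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle, in degrees, between two 3x3 rotation matrices.

    Matrices are nested lists. trace(R_est^T R_gt) equals the
    element-wise inner product of the two matrices, so no explicit
    matrix multiplication is needed.
    """
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def median_rotation_error_deg(estimates, ground_truths):
    """Median of per-pair rotation errors (assumed aggregation)."""
    errors = [rotation_error_deg(Re, Rg) for Re, Rg in zip(estimates, ground_truths)]
    return median(errors)


def rot_z(deg):
    """Helper for the example: rotation about the z-axis by `deg` degrees."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy usage: three estimated relative rotations against an identity ground truth.
estimates = [rot_z(10), rot_z(20), rot_z(90)]
ground_truths = [rot_z(0)] * 3
toy_median = median_rotation_error_deg(estimates, ground_truths)
```

Under this definition, a reported 62-degree improvement would mean that the median of such per-pair angles drops by 62 degrees between the initial and the refined estimates.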

Code Implementations: 1 repo
