CV, Aug 26, 2022

Perspective-1-Ellipsoid: Formulation, Analysis and Solutions of the Camera Pose Estimation Problem from One Ellipse-Ellipsoid Correspondence

arXiv:2208.12513v3, 9 citations, h-index: 26
Originality: Incremental advance
AI Analysis

This work addresses a mathematical inefficiency in how camera pose estimation from ellipse-ellipsoid correspondences is usually formulated. The advance is incremental, but it could make pose estimation built on object detections more efficient.

The paper tackles camera pose estimation by introducing a new ellipsoid-specific theoretical framework that reduces the problem to a 1 Degree-of-Freedom formulation, enabling closed-form solutions for the remaining unknowns.

In computer vision, camera pose estimation from correspondences between 3D geometric entities and their projections into the image has been a widely investigated problem. Although most state-of-the-art methods exploit low-level primitives such as points or lines, the emergence of very effective CNN-based object detectors in recent years has paved the way for the use of higher-level features carrying semantically meaningful information. Pioneering works in that direction have shown that modelling 3D objects by ellipsoids and 2D detections by ellipses offers a convenient way to link 2D and 3D data. However, the mathematical formalism most often used in the related literature does not make it easy to distinguish ellipsoids and ellipses from other quadrics and conics, leading to a loss of specificity that is potentially detrimental in some developments. Moreover, the linearization of the projection equation creates an over-representation of the camera parameters, which may also cause a loss of efficiency. In this paper, we therefore introduce an ellipsoid-specific theoretical framework and demonstrate its beneficial properties in the context of pose estimation. More precisely, we first show that the proposed formalism makes it possible to reduce the pose estimation problem to a position-only or orientation-only estimation problem in which the remaining unknowns can be derived in closed form. Then, we demonstrate that it can be further reduced to a 1 Degree-of-Freedom (1DoF) problem and provide the analytical derivations of the pose as a function of that unique scalar unknown. We illustrate our theoretical considerations with visual examples and include a discussion of the practical aspects. Finally, we release this paper along with the corresponding source code in order to contribute towards more efficient resolutions of ellipsoid-related pose estimation problems.
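
To make the ellipsoid-to-ellipse link concrete, here is a minimal sketch of the generic dual-quadric projection, which is presumably the kind of formalism the abstract says is most often used in the literature. It is not the paper's ellipsoid-specific framework and is not taken from the released source code. The function names (`dual_ellipsoid`, `project_to_dual_conic`, `ellipse_center_and_axes`) and the example camera are hypothetical, chosen only for illustration.

```python
import numpy as np

def dual_ellipsoid(center, semi_axes, R=None):
    """Dual quadric Q* (4x4) of an ellipsoid given its center, semi-axes and orientation."""
    R = np.eye(3) if R is None else np.asarray(R, dtype=float)
    a, b, c = semi_axes
    D = np.diag([a**2, b**2, c**2, -1.0])   # dual of x^2/a^2 + y^2/b^2 + z^2/c^2 = 1
    T = np.eye(4)                            # rigid transform placing the ellipsoid
    T[:3, :3] = R
    T[:3, 3] = center
    return T @ D @ T.T

def project_to_dual_conic(P, Q_dual):
    """Project a dual quadric with a 3x4 camera matrix P: C* = P Q* P^T."""
    return P @ Q_dual @ P.T

def ellipse_center_and_axes(C_dual):
    """Recover the center and semi-axes (ascending) of the ellipse from its dual conic."""
    Cn = C_dual / (-C_dual[2, 2])            # fix the scale so that Cn[2, 2] == -1
    center = -Cn[:2, 2]
    M = Cn[:2, :2] + np.outer(center, center)  # equals R2 diag(a^2, b^2) R2^T
    axes = np.sqrt(np.linalg.eigvalsh(M))
    return center, axes

# Assumed example: unit sphere 5 units in front of a normalized camera (K = I, no motion).
P = np.hstack([np.eye(3), np.zeros((3, 1))])
Q = dual_ellipsoid(center=[0.0, 0.0, 5.0], semi_axes=[1.0, 1.0, 1.0])
center, axes = ellipse_center_and_axes(project_to_dual_conic(P, Q))
print(center, axes)
```

For a sphere of radius r at distance d on the optical axis, the image is a circle of radius r / sqrt(d^2 - r^2) centered on the principal point, which gives a quick sanity check of the sketch. Note that the relation C* = P Q* P^T is linear in the entries of Q* but involves all twelve entries of P, which is one way to read the abstract's remark about an over-representation of the camera parameters.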

