Level Set-Based Camera Pose Estimation From Multiple 2D/3D Ellipse-Ellipsoid Correspondences
This work addresses camera localization for robotics and AR by improving object-based methods; the contribution is incremental, since it builds on existing ellipsoidal mapping approaches.
The paper tackles camera pose estimation from a single RGB image using ellipsoidal object models. It develops a level set-based ellipse-ellipse cost for 2D/3D ellipse-ellipsoid correspondences that handles partially visible objects, and it uses predictive uncertainty on the detected ellipses to weight the correspondences and improve pose accuracy.
In this paper, we propose an object-based camera pose estimation method that uses a single RGB image and a pre-built map of objects represented by ellipsoidal models. We show that, unlike for point correspondences, defining a cost function that characterizes the projection of a 3D object onto a 2D object detection is not straightforward. We develop an ellipse-ellipse cost based on level-set sampling, demonstrate its favorable properties for handling partially visible objects, and compare its performance with other common metrics. Finally, we show that using a predictive uncertainty on the detected ellipses allows a fair weighting of the contributions of the correspondences, which improves the computed pose. The code is released at https://gitlab.inria.fr/tangram/level-set-based-camera-pose-estimation.
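To make the idea of an ellipse-ellipse cost concrete, the following is a minimal sketch, not the authors' formulation. It projects an ellipsoid to an image ellipse with the standard dual-quadric relation C* = P Q* Pᵀ, then compares the projected and detected ellipses by sampling points on several level curves of the detected ellipse and checking which level of the projected ellipse they fall on. The function names, the choice of level function (the square root of the quadratic form), and the default `scales` and `n_angles` are assumptions made for illustration only.

```python
import numpy as np

def project_ellipsoid(P, axes, R_obj, t_obj):
    """Project an ellipsoid to an image ellipse through its dual quadric.

    P is a 3x4 projection matrix; axes are the ellipsoid semi-axes;
    (R_obj, t_obj) is the ellipsoid pose. Returns (center, S), where
    S = R diag(a^2, b^2) R^T describes the ellipse shape.
    """
    T = np.eye(4)
    T[:3, :3] = R_obj
    T[:3, 3] = t_obj
    Q_dual = T @ np.diag([axes[0] ** 2, axes[1] ** 2, axes[2] ** 2, -1.0]) @ T.T
    C_dual = P @ Q_dual @ P.T
    C_dual = C_dual / (-C_dual[2, 2])  # normalise so that C_dual[2, 2] == -1
    center = -C_dual[:2, 2]
    S = C_dual[:2, :2] + np.outer(center, center)
    return center, S

def level_value(points, center, S):
    """Level function of an ellipse: equals 1 on its boundary, 0 at its center."""
    d = points - center
    return np.sqrt(np.einsum("ni,ij,nj->n", d, np.linalg.inv(S), d))

def sample_level_sets(center, S, scales, n_angles):
    """Sample n_angles points on each level curve `s` of the ellipse (center, S)."""
    L = np.linalg.cholesky(S)  # S = L L^T, so x = center + s * L u lies on level s
    theta = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    unit = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    points = np.concatenate([center + s * unit @ L.T for s in scales])
    levels = np.concatenate([np.full(n_angles, s) for s in scales])
    return points, levels

def level_set_cost(detected, projected, scales=(0.5, 1.0, 1.5), n_angles=24):
    """Illustrative cost (assumed form): mean squared level mismatch at sampled points."""
    points, levels = sample_level_sets(*detected, scales, n_angles)
    residuals = level_value(points, *projected) - levels
    return float(np.mean(residuals ** 2))

# Unit sphere 5 units in front of a camera with identity intrinsics.
P = np.hstack([np.eye(3), np.zeros((3, 1))])
projected = project_ellipsoid(P, (1.0, 1.0, 1.0), np.eye(3), np.array([0.0, 0.0, 5.0]))
detected = (projected[0] + np.array([0.05, 0.0]), projected[1])  # shifted detection
print(level_set_cost(detected, projected))
```

In a pose estimator, a cost of this kind would be summed over all ellipse-ellipsoid correspondences and minimized over the camera pose; per-correspondence weights derived from the predictive uncertainty mentioned in the abstract could scale each term.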