Multi-View Optimization of Local Feature Geometry
This work addresses the limited keypoint localization accuracy of single-view local feature detection, which degrades downstream tasks such as Structure-from-Motion. The contribution is incremental in the sense that it complements existing feature extraction and matching methods rather than replacing them.
The paper tackles the problem of refining local image feature geometry from multiple views without known scene or camera geometry, resulting in consistent improvements in triangulation and camera localization performance for both hand-crafted and learned features.
In this work, we address the problem of refining the geometry of local image features from multiple views without known scene or camera geometry. Current approaches to local feature detection are inherently limited in their keypoint localization accuracy because they only operate on a single view. This limitation has a negative impact on downstream tasks such as Structure-from-Motion, where inaccurate keypoints lead to large errors in triangulation and camera localization. Our proposed method naturally complements the traditional feature extraction and matching paradigm. We first estimate local geometric transformations between tentative matches and then jointly optimize the keypoint locations over multiple views using a non-linear least squares formulation. Through a variety of experiments, we show that our method consistently improves the triangulation and camera localization performance for both hand-crafted and learned local features.
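To make the two-stage idea concrete, the following is a minimal sketch of the joint refinement step for a single feature track, under strong simplifying assumptions that are not taken from the paper. Each tentative match (u, v) is assumed to come with a constant 2D translation t saying that keypoint u really corresponds to the point at offset t from keypoint v, whereas the method itself estimates more general local geometric transformations. The robust cost over the residuals d_v - d_u - t is minimized here with iteratively reweighted least squares and Gauss-Seidel sweeps using Cauchy weights, as a stand-in for a general non-linear least squares solver. The global shift ambiguity is removed by forcing the offsets to have zero mean. The names `refine_track`, `Edge`, and `scale` are hypothetical.

```python
from typing import List, Tuple

Vec = Tuple[float, float]
# Hypothetical edge format: (u, v, t) means keypoint u corresponds to the
# point at offset t from keypoint v (a constant-translation assumption).
Edge = Tuple[int, int, Vec]


def refine_track(n: int, edges: List[Edge], scale: float = 1.0,
                 sweeps: int = 200) -> List[Vec]:
    """Return one 2D offset per keypoint of a single track.

    Minimizes a Cauchy-robustified sum of ||d_v - d_u - t||^2 over all edges
    by iteratively reweighted Gauss-Seidel updates (an illustrative solver).
    """
    d = [[0.0, 0.0] for _ in range(n)]
    for _ in range(sweeps):
        # Cauchy weights computed from the current residuals.
        weights = []
        for u, v, (tx, ty) in edges:
            rx = d[v][0] - d[u][0] - tx
            ry = d[v][1] - d[u][1] - ty
            weights.append(1.0 / (1.0 + (rx * rx + ry * ry) / scale ** 2))
        # Update each offset to the weighted mean of what its edges predict.
        for k in range(n):
            sx = sy = sw = 0.0
            for w, (u, v, (tx, ty)) in zip(weights, edges):
                if u == k and v != k:
                    sx += w * (d[v][0] - tx)
                    sy += w * (d[v][1] - ty)
                elif v == k and u != k:
                    sx += w * (d[u][0] + tx)
                    sy += w * (d[u][1] + ty)
                else:
                    continue
                sw += w
            if sw > 0.0:
                d[k] = [sx / sw, sy / sw]
        # Fix the gauge: a common shift of all offsets leaves the cost unchanged.
        mx = sum(p[0] for p in d) / n
        my = sum(p[1] for p in d) / n
        for p in d:
            p[0] -= mx
            p[1] -= my
    return [(p[0], p[1]) for p in d]


# Three views of one point with mutually consistent pairwise translations.
offsets = refine_track(3, [(0, 1, (1.0, 0.0)),
                           (1, 2, (0.0, 2.0)),
                           (0, 2, (1.0, 2.0))])
```

In this toy setting the refined offsets reconcile all pairwise estimates at once instead of trusting any single match. The zero-mean constraint is just one possible way to anchor the solution; fixing one keypoint or penalizing large offsets would be alternatives.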