RaCo: Ranking and Covariance for Practical Learned Keypoints
This work addresses the need for practical and interpretable keypoints in 3D computer vision tasks, offering a simple and effective solution without requiring covisible image pairs.
The paper tackles the problem of learning robust and versatile keypoints for 3D computer vision by introducing RaCo, a lightweight neural network that integrates a repeatable detector, a differentiable ranker, and a covariance estimator. RaCo achieves state-of-the-art performance in keypoint repeatability and two-view matching, especially under large in-plane rotations.
This paper introduces RaCo, a lightweight neural network designed to learn robust and versatile keypoints suitable for a variety of 3D computer vision tasks. The model integrates three key components: a repeatable keypoint detector, a differentiable ranker that maximizes matches under a limited keypoint budget, and a covariance estimator that quantifies spatial uncertainty in metric scale. RaCo is trained on perspective image crops only and needs no covisible image pairs. It achieves strong rotational robustness through extensive data augmentation, without relying on computationally expensive equivariant network architectures. Evaluated on several challenging datasets, the method demonstrates state-of-the-art performance in keypoint repeatability and two-view matching, particularly under large in-plane rotations. Ultimately, RaCo provides a simple and effective strategy to independently estimate keypoint ranking and metric covariance without additional labels, detecting interpretable and repeatable interest points. The code is available at https://github.com/cvg/RaCo.
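
To make the three quantities in the abstract concrete, the following is a minimal sketch in plain Python and is not taken from the RaCo code base. It assumes a hypothetical interface in which each keypoint is an (x, y) tuple with a scalar ranker score and a 2x2 covariance. The function names, the 3-pixel tolerance, and the repeatability definition (the fraction of keypoints that reappear within a tolerance after a known in-plane rotation) are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
import math


def top_k_by_rank(keypoints, rank_scores, k):
    """Keep the k keypoints with the highest ranker scores (hypothetical interface)."""
    order = sorted(range(len(keypoints)), key=lambda i: rank_scores[i], reverse=True)
    return [keypoints[i] for i in order[:k]]


def rotate(point, angle_deg, center=(0.0, 0.0)):
    """Rotate a 2D point about `center` by an in-plane angle given in degrees."""
    a = math.radians(angle_deg)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + math.cos(a) * x - math.sin(a) * y,
            center[1] + math.sin(a) * x + math.cos(a) * y)


def repeatability(kpts_a, kpts_b, angle_deg, center=(0.0, 0.0), tol=3.0):
    """Fraction of kpts_a that land within `tol` pixels of some keypoint in
    kpts_b after being rotated by the known angle (assumed definition)."""
    if not kpts_a:
        return 0.0
    hits = 0
    for p in kpts_a:
        q = rotate(p, angle_deg, center)
        if any(math.dist(q, r) <= tol for r in kpts_b):
            hits += 1
    return hits / len(kpts_a)


def mahalanobis_sq(residual, cov):
    """Squared Mahalanobis distance r^T cov^{-1} r for a 2D residual and a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    rx, ry = residual
    # Closed-form 2x2 inverse: cov^{-1} = (1 / det) * [[d, -b], [-c, a]]
    return (rx * (d * rx - b * ry) + ry * (-c * rx + a * ry)) / det
```

In this sketch, `top_k_by_rank` stands in for selecting a limited keypoint budget from ranker scores, `repeatability` measures how many selected points survive a known in-plane rotation, and `mahalanobis_sq` shows how a per-keypoint covariance can down-weight residuals along uncertain directions in a downstream estimator.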