CV · Mar 10, 2025

DaD: Distilled Reinforcement Learning for Diverse Keypoint Detection

Johan Edstedt, Georg Bökman, Mårten Wadenbäck, Michael Felsberg

arXiv:2503.07347v213.16 citations · h-index: 9 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses keypoint detection for scalable Structure-from-Motion systems. It proposes a descriptor-free method; the advance is incremental, consisting mainly of a refinement of detector types.

The paper proposes a self-supervised, descriptor-free objective for keypoint detection in Structure-from-Motion systems, trained with reinforcement learning and a balanced top-K sampling strategy that keeps training from degenerating. Because this yields two kinds of detectors, which find only light or only dark keypoints respectively, a third detector (DaD) is trained by optimizing the Kullback-Leibler divergence of the pointwise maximum of the two. The authors report significant improvements over the state of the art across multiple benchmarks.

Keypoints are what enable Structure-from-Motion (SfM) systems to scale to thousands of images. However, designing a keypoint detection objective is a non-trivial task, as SfM is non-differentiable. Typically, an auxiliary objective involving a descriptor is optimized. This, however, induces a dependency on the descriptor, which is undesirable. In this paper we propose a fully self-supervised and descriptor-free objective for keypoint detection, through reinforcement learning. To ensure training does not degenerate, we leverage a balanced top-K sampling strategy. While this already produces competitive models, we find that two qualitatively different types of detectors emerge, which are only able to detect light and dark keypoints respectively. To remedy this, we train a third detector, DaD, that optimizes the Kullback-Leibler divergence of the pointwise maximum of both light and dark detectors. Our approach significantly improves upon SotA across a range of benchmarks. Code and model weights are publicly available at https://github.com/parskatt/dad
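As a rough illustration of the distillation step described in the abstract, the sketch below builds a target distribution from the pointwise maximum of two teacher keypoint distributions and measures a student's Kullback-Leibler divergence from it. This is not the authors' implementation. The function names, the softmax normalization over pixels, the renormalization of the maximum, and the direction of the divergence (target to student) are all assumptions made for this example, and the score maps are toy values.

```python
import math

def softmax(logits):
    """Turn a flattened score map into a probability distribution over pixels."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_target(light_logits, dark_logits):
    """Pointwise maximum of the two teacher distributions, renormalized (assumed)."""
    p_light = softmax(light_logits)
    p_dark = softmax(dark_logits)
    p_max = [max(a, b) for a, b in zip(p_light, p_dark)]
    total = sum(p_max)
    return [p / total for p in p_max]

def kl_divergence(target, student, eps=1e-12):
    """KL(target || student), summed over pixels; the direction is an assumption."""
    return sum(t * (math.log(t + eps) - math.log(s + eps))
               for t, s in zip(target, student))

# Toy flattened 2x3 score maps: the "light" teacher fires on the first pixel,
# the "dark" teacher on the last one.
light_logits = [4.0, 0.0, 0.0, 0.0, 0.0, 0.0]
dark_logits = [0.0, 0.0, 0.0, 0.0, 0.0, 4.0]

target = distillation_target(light_logits, dark_logits)
student = softmax([0.0] * 6)  # an untrained student with a uniform distribution
loss = kl_divergence(target, student)
```

In this toy case the target keeps a peak at both the light and the dark location, so a student that minimizes the divergence is pushed to detect both kinds of keypoints.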
