ReF -- Rotation Equivariant Features for Local Feature Matching
This work addresses rotation invariance in local feature matching for applications like visual place recognition, representing an incremental improvement over existing methods.
The paper tackles the problem of improving rotation robustness in sparse local feature matching for computer vision and robotics. It proposes a method that combines rotation-specific features from steerable CNNs with augmentation-trained standard CNNs, and reports state-of-the-art performance on benchmarks such as HPatches and UrbanScenes3D-Air.
Sparse local feature matching is pivotal for many computer vision and robotics tasks. To improve the invariance of local features to challenging appearance conditions and viewing angles, and hence their usefulness, existing learning-based methods have primarily focused on data augmentation-based training. In this work, we propose an alternative, complementary approach that centers on inducing bias in the model architecture itself to generate 'rotation-specific' features using Steerable E2-CNNs, which are then group-pooled to achieve rotation-invariant local features. We demonstrate that the high-performance but rotation-specific coverage of the steerable CNNs can be expanded to all rotation angles by combining it with augmentation-trained standard CNNs, which have broader coverage but are often inaccurate, thus creating a state-of-the-art rotation-robust local feature matcher. We benchmark our proposed methods against existing techniques on HPatches and a newly proposed UrbanScenes3D-Air dataset for visual place recognition. Furthermore, we present a detailed analysis of the performance effects of ensembling, robust estimation, network architecture variations, and the use of rotation priors.
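
To illustrate why group pooling turns rotation-specific responses into a rotation-invariant feature, the following is a minimal sketch in plain NumPy. It is not the paper's implementation and does not use a steerable-CNN library. It assumes the cyclic group C4 (rotations by multiples of 90 degrees), a single random filter, and max pooling over orientations; the names `c4_responses` and `group_pool` are hypothetical.

```python
import numpy as np

def c4_responses(patch, filt):
    """Correlate `patch` with the four 90-degree rotations of `filt`.

    Returns one response per orientation (an orientation-indexed feature).
    """
    return np.array([np.sum(patch * np.rot90(filt, k)) for k in range(4)])

def group_pool(responses):
    """Max over the orientation axis; unchanged by cyclic shifts of that axis."""
    return responses.max(axis=-1)

# Illustrative random data (assumption: any square patch and filter would do).
rng = np.random.default_rng(0)
patch = rng.standard_normal((5, 5))
filt = rng.standard_normal((5, 5))

r0 = c4_responses(patch, filt)            # responses for the original patch
r1 = c4_responses(np.rot90(patch), filt)  # responses for the rotated patch

# Equivariance: rotating the input cyclically shifts the orientation responses.
shifted = np.allclose(r1, np.roll(r0, 1))

# Invariance: pooling over the orientation axis removes the shift.
invariant = np.isclose(group_pool(r0), group_pool(r1))
```

In this toy setting the invariance is exact only for the four rotations in C4. Intermediate angles would need a finer rotation group or, as the abstract suggests, a complementary model with broader angular coverage.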