HyperPose: Hypernetwork-Infused Camera Pose Localization and an Extended Cambridge Landmarks Dataset
This work addresses the domain gap in camera pose localization, but its contribution is incremental because it builds on existing absolute pose regression architectures.
The paper addresses the degradation of camera pose localization caused by domain disparities that arise from environmental variations. It proposes HyperPose, which uses hypernetworks to adapt the weights of the regression heads to each input image, and reports notable performance improvements on indoor and outdoor datasets.
In this work, we propose HyperPose, which incorporates hypernetworks into absolute camera pose regressors. The inherent appearance variations in natural scenes, attributable to environmental conditions, perspective, and lighting, induce a significant domain disparity between the training and test datasets. This disparity degrades the precision of contemporary localization networks. To mitigate this, we advocate incorporating hypernetworks into single-scene and multi-scene camera pose regression models. During inference, the hypernetwork dynamically computes adaptive weights for the localization regression heads based on the particular input image, effectively narrowing the domain gap. Using indoor and outdoor datasets, we evaluate the HyperPose methodology across multiple established absolute pose regression architectures. We also introduce and share the Extended Cambridge Landmarks (ECL), a novel localization dataset based on the Cambridge Landmarks dataset that shows its scenes in multiple seasons with significantly varying appearance conditions. Our empirical experiments demonstrate that HyperPose yields notable performance enhancements for single- and multi-scene architectures. We have made our source code, pre-trained models, and the ECL dataset openly available.
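The idea of a hypernetwork that computes regression-head weights from the input image can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `HyperPoseHeadSketch`, the single linear hypernetwork layer, the feature and embedding sizes, and the 7-D pose layout (3-D position plus unit quaternion) are all assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random


def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HyperPoseHeadSketch:
    """Hypothetical pose head whose weights are generated per input image."""

    def __init__(self, feat_dim=8, embed_dim=4, pose_dim=7, seed=0):
        rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.pose_dim = pose_dim
        # Assumed hypernetwork: one linear layer that maps an image embedding
        # to a flat vector holding every weight and bias of the pose head.
        n_params = pose_dim * feat_dim + pose_dim
        self.hyper_w = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
                        for _ in range(n_params)]
        self.hyper_b = [0.0] * n_params

    def generate_head(self, embedding):
        """Turn the hypernetwork output into a weight matrix and a bias."""
        flat = linear(self.hyper_w, self.hyper_b, embedding)
        split = self.pose_dim * self.feat_dim
        weights = [flat[i * self.feat_dim:(i + 1) * self.feat_dim]
                   for i in range(self.pose_dim)]
        return weights, flat[split:]

    def predict(self, features, embedding):
        """Regress a pose with head weights adapted to this input."""
        weights, bias = self.generate_head(embedding)
        pose = linear(weights, bias, features)
        position, quat = pose[:3], pose[3:]
        # Normalize the orientation so it is a valid unit quaternion.
        norm = math.sqrt(sum(q * q for q in quat)) or 1.0
        return position, [q / norm for q in quat]


# The same backbone features with two different image embeddings
# (standing in for two appearance conditions) go through different heads.
head = HyperPoseHeadSketch()
features = [0.5] * 8
pos_a, quat_a = head.predict(features, [1.0, 0.0, 0.0, 0.0])
pos_b, quat_b = head.predict(features, [0.0, 1.0, 0.0, 0.0])
```

In this sketch the head has no fixed parameters of its own: everything it uses comes from the hypernetwork, so a change in the input embedding changes the mapping from features to pose. A real model would learn the hypernetwork jointly with the backbone and could combine generated weights with fixed ones; that choice is left open here.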