Distillation Improves Visual Place Recognition for Low Quality Images
This work targets reliable place recognition in resource-constrained settings, such as real-time visual localization over limited bandwidth. The contribution is incremental: it applies existing distillation techniques to one specific bottleneck.
The paper tackles the loss of visual place recognition (VPR) accuracy caused by image-quality degradation under network bandwidth constraints. It uses knowledge distillation to learn feature representations from high-quality images, so that more discriminative descriptors can be extracted from low-quality images. The result is a significant improvement in VPR recall rates under JPEG compression, resolution reduction, and video quantization.
Real-time visual localization often relies on online computing, in which query images or videos are transmitted to remote servers for visual place recognition (VPR). However, limited network bandwidth forces a reduction in image quality, which degrades global image descriptors and reduces VPR accuracy. We address this issue at the descriptor extraction level with a knowledge-distillation methodology that learns feature representations from high-quality images in order to extract more discriminative descriptors from low-quality images. Our approach includes the Inter-channel Correlation Knowledge Distillation (ICKD) loss, Mean Squared Error (MSE) loss, and Triplet loss. We validate the proposed losses on multiple VPR methods and datasets subjected to JPEG compression, resolution reduction, and video quantization. We obtain significant improvements in VPR recall rates under all three tested modalities of lowered image quality. Furthermore, we fill a gap in the VPR literature on video-based data and its influence on VPR performance. This work contributes to more reliable place recognition in resource-constrained environments.
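
The abstract names three losses but does not say how they are combined. The sketch below is one plausible reading, not the authors' implementation. It assumes a teacher network that sees the high-quality image and a student network that sees the low-quality version of the same image, and it sums the three terms with hypothetical weights `w_ickd`, `w_mse`, and `w_triplet`. The normalization of the ICKD term, the triplet margin, and all function names are likewise assumptions made for illustration.

```python
import math


def channel_correlation(features):
    """Return the C x C matrix of inner products between channels.

    `features` is a list of C channels, each a flat list of H*W activations.
    """
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]


def ickd_loss(student_feats, teacher_feats):
    """Squared difference between student and teacher correlation matrices.

    Averaging over the C*C entries is an assumed normalization.
    """
    c = len(student_feats)
    gs = channel_correlation(student_feats)
    gt = channel_correlation(teacher_feats)
    return sum((gs[i][j] - gt[i][j]) ** 2
               for i in range(c) for j in range(c)) / (c * c)


def mse_loss(student_desc, teacher_desc):
    """Mean squared error between two global descriptors of equal length."""
    return sum((s - t) ** 2
               for s, t in zip(student_desc, teacher_desc)) / len(student_desc)


def euclidean(a, b):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The margin value is a placeholder, not taken from the paper.
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def total_loss(student_feats, teacher_feats, student_desc, teacher_desc,
               positive_desc, negative_desc,
               w_ickd=1.0, w_mse=1.0, w_triplet=1.0, margin=0.1):
    """Weighted sum of the three terms (weights are hypothetical).

    The student descriptor of the low-quality query is used as the
    triplet anchor; this pairing is an assumption.
    """
    return (w_ickd * ickd_loss(student_feats, teacher_feats)
            + w_mse * mse_loss(student_desc, teacher_desc)
            + w_triplet * triplet_loss(student_desc, positive_desc,
                                       negative_desc, margin))
```

In this reading, the ICKD and MSE terms pull the student's intermediate features and final descriptor toward the teacher's, while the triplet term keeps the low-quality descriptor closer to a matching place than to a non-matching one. A practical version would operate on batched tensors in a deep-learning framework; plain lists are used here only to keep the arithmetic visible.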