MobileGeo: Exploring Hierarchical Knowledge Distillation for Resource-Efficient Cross-view Drone Geo-Localization
This addresses the challenge of deploying efficient drone geo-localization on mobile edge devices, which is critical for autonomous navigation in GNSS-denied environments, representing a strong specific gain rather than a foundational advancement.
The paper tackles the problem of resource-intensive cross-view geo-localization for drones by proposing MobileGeo, a mobile-friendly framework that achieves a 4.19% improvement in AP on the University-1652 dataset while being over 5x more efficient in FLOPs and 3x faster, running at 251.5 FPS on an edge device.
Cross-view geo-localization (CVGL) enables drone localization by matching aerial images to geo-tagged satellite databases, which is critical for autonomous navigation in GNSS-denied environments. However, existing methods rely on resource-intensive feature alignment and multi-branch architectures, incurring high inference costs that limit their deployment on mobile edge devices. We propose MobileGeo, a mobile-friendly framework designed for efficient on-device CVGL. MobileGeo achieves its efficiency through two key components: 1) During training, a Hierarchical Distillation (HD-CVGL) paradigm, coupled with Uncertainty-Aware Prediction Alignment (UAPA), distills essential information into a compact model without incurring inference overhead. 2) During inference, an efficient Multi-view Selection Refinement Module (MSRM) leverages mutual information to filter redundant views and reduce computational load. Extensive experiments demonstrate that MobileGeo outperforms previous state-of-the-art methods, achieving a 4.19\% improvement in AP on University-1652 dataset while being over 5$\times$ more efficient in FLOPs and 3$\times$ faster. Crucially, MobileGeo runs at 251.5 FPS on an NVIDIA AGX Orin edge device, demonstrating its practical viability for real-time on-device drone geo-localization.