CV · Mar 8

Scale-Aware UAV-to-Satellite Cross-View Geo-Localization: A Semantic Geometric Approach

arXiv:2603.07535v1
Predicted impact: top 42% in CV · last 90 days
Originality: Highly original
AI Analysis

This work improves the robustness of UAV-to-satellite geo-localization for UAV navigation and target localization by overcoming scale ambiguity, a common real-world challenge for existing methods.

This paper addresses the problem of cross-view geo-localization (CVGL) between UAV and satellite images when the UAV image scale is unknown. It proposes a geometric framework that uses small vehicles as semantic anchors to estimate the absolute metric scale from monocular UAV images, which then guides scale-adaptive satellite image cropping to improve feature alignment.

Cross-View Geo-Localization (CVGL) between UAV imagery and satellite images plays a crucial role in target localization and UAV self-positioning. However, most existing methods rely on the idealized assumption of scale consistency between UAV queries and satellite galleries, overlooking the severe scale ambiguity commonly encountered in real-world scenarios. This discrepancy leads to field-of-view misalignment and feature mismatch, significantly degrading CVGL robustness. To address this issue, we propose a geometric framework that recovers the absolute metric scale from monocular UAV images using semantic anchors. Specifically, small vehicles (SVs), characterized by relatively stable prior size distributions and high detectability, are exploited as metric references. A Decoupled Stereoscopic Projection Model is introduced to estimate the absolute image scale from these semantic targets. By decomposing vehicle dimensions into radial and tangential components, the model compensates for perspective distortions in 2D detections of 3D vehicles, enabling more accurate scale estimation. To further reduce intra-class size variation and detection noise, a dual-dimension fusion strategy with Interquartile Range (IQR)-based robust aggregation is employed. The estimated global scale is then used as a physical constraint for scale-adaptive satellite image cropping, improving UAV-to-satellite feature alignment. Experiments on augmented DenseUAV and UAV-VisLoc datasets demonstrate that the proposed method significantly improves CVGL robustness under unknown UAV image scales. Additionally, the framework shows strong potential for downstream applications such as passive UAV altitude estimation and 3D model scale recovery.
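The aggregation and cropping steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's Decoupled Stereoscopic Projection Model: it assumes a near-nadir view, oriented vehicle detections given as (length, width) in pixels, and no perspective compensation. The prior vehicle dimensions, the function names, and the fence factor `k` are all illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

# Assumed prior size of a small vehicle in metres (illustrative, not from the paper).
PRIOR_LENGTH_M = 4.5
PRIOR_WIDTH_M = 1.8


def per_vehicle_scales(boxes_px):
    """Return metres-per-pixel estimates from both box dimensions.

    boxes_px: iterable of (length_px, width_px) for oriented detections.
    Each vehicle contributes two estimates, one per dimension.
    """
    boxes = np.asarray(boxes_px, dtype=float)
    length_scales = PRIOR_LENGTH_M / boxes[:, 0]
    width_scales = PRIOR_WIDTH_M / boxes[:, 1]
    return np.concatenate([length_scales, width_scales])


def iqr_aggregate(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR], then take the median."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    keep = (values >= q1 - k * iqr) & (values <= q3 + k * iqr)
    return float(np.median(values[keep]))


def satellite_crop_size_px(uav_width_px, uav_scale_m_per_px, sat_scale_m_per_px):
    """Side length, in satellite pixels, covering the UAV image's ground footprint."""
    ground_width_m = uav_width_px * uav_scale_m_per_px
    return int(round(ground_width_m / sat_scale_m_per_px))


if __name__ == "__main__":
    # Three car-sized detections plus one much larger box (e.g. a truck or a false positive).
    detections = [(45, 18), (44, 18.5), (46, 17.5), (90, 36)]
    scale = iqr_aggregate(per_vehicle_scales(detections))
    crop = satellite_crop_size_px(1000, scale, sat_scale_m_per_px=0.5)
    print(f"estimated UAV scale: {scale:.4f} m/px, satellite crop: {crop} px")
```

In this toy input the oversized detection yields scale estimates far below the rest, so the IQR fence discards them before the median is taken. The resulting scale then fixes how large a satellite crop must be to cover the same ground extent as the UAV image.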

