SUES-200: A Multi-height Multi-scene Cross-view Image Benchmark Across Drone and Satellite
SUES-200 addresses a gap in public datasets for drone-based cross-view matching and enables better assessment of how models adapt to complex scenes, though the contribution is incremental because it centers on dataset creation.
The authors tackled the problem of cross-view image matching for drone navigation by introducing SUES-200, a dataset with 24,120 images from drones at four heights and corresponding satellite views, which helps models learn height-discriminative features.
Cross-view image matching aims to match images of the same target scene acquired from different platforms. With the rapid development of drone technology, cross-view matching with neural network models has become a widely accepted approach to drone positioning and navigation. However, existing public datasets do not include images captured by drones at different heights, and their scene types are relatively homogeneous, which makes it difficult to assess a model's ability to adapt to complex and changing scenes. To this end, we present a new cross-view dataset called SUES-200. SUES-200 contains 24,120 images acquired by a drone at four different heights, together with corresponding satellite-view images of the same target scenes. To the best of our knowledge, SUES-200 is the first public dataset that accounts for the differences in aerial imagery captured by drones flying at different heights. In addition, we developed an evaluation framework for efficient training, testing, and evaluation of cross-view matching models, under which we comprehensively analyze the performance of nine architectures. We then propose a robust baseline model for use with SUES-200. Experimental results show that SUES-200 can help a model learn features that are highly discriminative with respect to the drone's height.
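To make the matching task concrete, the following minimal sketch treats cross-view matching as retrieval: each drone image is a query, the satellite images form the gallery, and a query counts as matched when its true scene appears among the top-k most similar gallery entries. Everything here is an illustrative assumption, not the paper's evaluation code: the function names `cosine` and `recall_at_k`, the scene labels, and the two-dimensional feature vectors are hypothetical, and the text above does not state which metrics the authors report.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(queries, gallery, k=1):
    """Fraction of queries whose true scene is among the top-k gallery matches.

    queries, gallery: lists of (scene_id, feature_vector) pairs.
    """
    hits = 0
    for scene_id, q_feat in queries:
        # Rank gallery entries from most to least similar to the query.
        ranked = sorted(gallery, key=lambda item: cosine(q_feat, item[1]), reverse=True)
        top_ids = [gid for gid, _ in ranked[:k]]
        hits += scene_id in top_ids
    return hits / len(queries)


# Toy, made-up embeddings (hypothetical; real features would come from a trained model).
satellite = [("scene_a", [1.0, 0.0]), ("scene_b", [0.0, 1.0]), ("scene_c", [1.0, 1.0])]
drone = [("scene_a", [0.9, 0.1]), ("scene_b", [0.2, 0.8]), ("scene_c", [1.0, 0.1])]

print(recall_at_k(drone, satellite, k=1))  # two of three queries match at rank 1
print(recall_at_k(drone, satellite, k=2))  # all three match within the top 2
```

In this toy data the `scene_c` drone feature lies closer to the `scene_a` satellite feature than to its own, so it is missed at k=1 and recovered at k=2. Running the same kind of check separately per flight height is one way a multi-height dataset could expose how matching quality varies with altitude.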