Evaluation of Cross-View Matching to Improve Ground Vehicle Localization with Aerial Perception
This work addresses localization for autonomous ground vehicles by evaluating cross-view matching over full trajectories rather than isolated queries. The contribution is incremental, since it builds on existing cross-view matching methods.
The paper tackles the problem of localizing a ground vehicle over longer trajectories. It evaluates cross-view matching techniques, treats the resulting matches as sensor measurements fused with a particle filter, and reports results on simulated and real-world datasets while varying parameters such as aerial image height and camera pitch.
Cross-view matching refers to the problem of finding, for a given ground-view query image, the closest match in a database of aerial images. If the aerial images are geotagged, then the closest matching aerial image can be used to localize the query ground-view image. Due to the recent success of deep learning methods, several cross-view matching techniques have been proposed. These approaches perform well for the matching of isolated query images. However, their evaluation over a trajectory is limited. In this paper, we evaluate cross-view matching for the task of localizing a ground vehicle over a longer trajectory. We treat these cross-view matches as sensor measurements that are fused using a particle filter. We evaluate the performance of this method using a city-wide dataset collected in a photorealistic simulation by varying four parameters: the height of the aerial images, the pitch of the aerial camera mount, the field of view (FOV) of the ground camera, and the methodology of fusing cross-view measurements in the particle filter. We also report the results obtained using our pipeline on a real-world dataset collected using Google Street View and satellite view APIs.
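The idea of treating cross-view matches as sensor measurements in a particle filter can be illustrated with a minimal sketch. This is not the pipeline evaluated in the paper. It is a toy 2-D example under loudly stated assumptions: the "descriptors" are a synthetic orthonormal projection of position rather than learned image embeddings, the measurement model is an isotropic Gaussian around the geotag of the retrieved aerial tile, and all names (`retrieve_match`, `predict`, `update`, `resample`, `run_demo`) and parameter values (tile spacing, noise levels, particle count) are hypothetical.

```python
import numpy as np

def retrieve_match(query_emb, aerial_embs, aerial_xy):
    """Return the geotag (x, y) of the aerial descriptor nearest to the query."""
    dists = np.linalg.norm(aerial_embs - query_emb, axis=1)
    return aerial_xy[np.argmin(dists)]

def predict(particles, odom, motion_std, rng):
    """Propagate particles by the odometry step plus Gaussian motion noise."""
    return particles + odom + rng.normal(0.0, motion_std, particles.shape)

def update(weights, particles, z_xy, meas_std):
    """Reweight particles with a Gaussian likelihood centred on the matched geotag.
    Assumption: this likelihood is illustrative, not the paper's fusion method."""
    sq_dist = np.sum((particles - z_xy) ** 2, axis=1)
    w = weights * np.exp(-0.5 * sq_dist / meas_std**2) + 1e-300  # avoid all-zero weights
    return w / w.sum()

def resample(particles, weights, rng):
    """Systematic resampling; returns resampled particles with uniform weights."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    idx = np.searchsorted(cumulative, positions)
    return particles[idx], np.full(n, 1.0 / n)

def run_demo(seed=0, n_particles=1000, n_steps=80):
    """Drive a simulated vehicle east and return the final localization error in metres."""
    rng = np.random.default_rng(seed)
    # Toy geotagged aerial database: one tile every 10 m on a 200 m x 200 m grid.
    gx, gy = np.meshgrid(np.arange(0.0, 201.0, 10.0), np.arange(0.0, 201.0, 10.0))
    aerial_xy = np.column_stack([gx.ravel(), gy.ravel()])
    # Stand-in for learned descriptors: orthonormal projection of position into 8-D.
    q, _ = np.linalg.qr(rng.normal(size=(8, 2)))
    aerial_embs = aerial_xy @ q.T

    true_xy = np.array([20.0, 100.0])
    odom = np.array([2.0, 0.0])  # 2 m east per step
    # No prior knowledge of the start: particles are spread over the whole map.
    particles = rng.uniform(0.0, 200.0, size=(n_particles, 2))
    weights = np.full(n_particles, 1.0 / n_particles)

    for _ in range(n_steps):
        true_xy = true_xy + odom
        particles = predict(particles, odom, motion_std=1.0, rng=rng)
        # Noisy ground-view descriptor, matched against the aerial database.
        query_emb = true_xy @ q.T + rng.normal(0.0, 1.0, size=8)
        z_xy = retrieve_match(query_emb, aerial_embs, aerial_xy)
        weights = update(weights, particles, z_xy, meas_std=10.0)
        estimate = weights @ particles  # weighted mean position
        particles, weights = resample(particles, weights, rng)
    return float(np.linalg.norm(estimate - true_xy))

if __name__ == "__main__":
    print(f"final position error: {run_demo():.2f} m")
```

In this sketch each retrieved geotag is quantized to the tile grid, so a single match can be off by several metres. Fusing matches along the trajectory is what lets the estimate settle between tiles, which is the motivation for evaluating over a trajectory rather than on isolated queries.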