RO · CV · Sep 28, 2017

X-View: Graph-Based Semantic Multi-View Localization

arXiv:1709.09905v3 · 178 citations
Originality: Incremental advance
AI Analysis

This paper addresses global localization for robots in human-made environments. It offers a new method for handling viewpoint variations, though its improvement over existing approaches is incremental.

The paper tackles global registration of multi-view robot data under drastic viewpoint changes by matching semantic graph descriptors. It reports up to 85% accuracy on global localization in the multi-view case, compared with up to 75% for the benchmarked appearance-based baselines.

Global registration of multi-view robot data is a challenging task. Appearance-based global localization approaches often fail under drastic view-point changes, as representations have limited view-point invariance. This work is based on the idea that human-made environments contain rich semantics which can be used to disambiguate global localization. Here, we present X-View, a Multi-View Semantic Global Localization system. X-View leverages semantic graph descriptor matching for global localization, enabling localization under drastically different view-points. While the approach is general in terms of the semantic input data, we present and evaluate an implementation on visual data. We demonstrate the system in experiments on the publicly available SYNTHIA dataset, on a realistic urban dataset recorded with a simulator, and on real-world StreetView data. Our findings show that X-View is able to globally localize aerial-to-ground, and ground-to-ground robot data of drastically different view-points. Our approach achieves an accuracy of up to 85 % on global localizations in the multi-view case, while the benchmarked baseline appearance-based methods reach up to 75 %.
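To make the idea of semantic graph descriptor matching concrete, here is a minimal sketch. It is not the authors' implementation. The abstract does not say how descriptors are built, so this sketch assumes one simple choice: each node is described by the multiset of semantic label sequences found along short walks starting at that node, and a query node is matched to the database node with the largest descriptor overlap. The function names (`walk_descriptor`, `similarity`, `match_nodes`), the `depth` parameter, and the toy graphs are all hypothetical.

```python
from collections import Counter


def walk_descriptor(adj, labels, start, depth=2):
    """Multiset of label sequences along all walks of `depth` steps from `start`.

    Assumption: walks are enumerated exhaustively so the result is deterministic;
    sampling random walks instead would give an approximate version of the same idea.
    """
    desc = Counter()
    stack = [(start, (labels[start],))]
    while stack:
        node, seq = stack.pop()
        # Record the walk once it is long enough or cannot be extended.
        if len(seq) == depth + 1 or not adj[node]:
            desc[seq] += 1
            continue
        for nxt in adj[node]:
            stack.append((nxt, seq + (labels[nxt],)))
    return desc


def similarity(d1, d2):
    """Multiset overlap of two descriptors, normalised to [0, 1]."""
    shared = sum((d1 & d2).values())  # Counter '&' keeps the minimum counts
    total = max(sum(d1.values()), sum(d2.values()))
    return shared / total if total else 0.0


def match_nodes(query, database, depth=2):
    """Map each query node to (best database node, similarity score)."""
    q_adj, q_labels = query
    d_adj, d_labels = database
    d_desc = {n: walk_descriptor(d_adj, d_labels, n, depth) for n in d_adj}
    matches = {}
    for q in q_adj:
        q_desc = walk_descriptor(q_adj, q_labels, q, depth)
        best = max(d_desc, key=lambda n: similarity(q_desc, d_desc[n]))
        matches[q] = (best, similarity(q_desc, d_desc[best]))
    return matches


# Toy database graph (hypothetical): nodes are semantic blobs, edges mean adjacency.
db_labels = {0: "building", 1: "road", 2: "car", 3: "tree", 4: "sidewalk"}
db_adj = {0: [1], 1: [0, 2, 4], 2: [1], 3: [4], 4: [1, 3]}

# Toy query graph: part of the same scene seen from another view, with new node ids.
q_labels = {"a": "car", "b": "road", "c": "building"}
q_adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

matches = match_nodes((q_adj, q_labels), (db_adj, db_labels))
for q, (node, score) in matches.items():
    print(q, "->", node, round(score, 2))
```

The descriptor uses only labels and graph topology, not pixel appearance, which is why this kind of matching can tolerate large viewpoint changes. A full localization system would still need to turn the node matches into a pose estimate and reject outliers, and this sketch does neither.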

