ROOct 11, 2016

Attention and Anticipation in Fast Visual-Inertial Navigation

arXiv:1610.03344v388 citations
AI Analysis

This work addresses computational efficiency for agile drones in visual-inertial navigation, though it is incremental as it builds on existing methods with a novel selection approach.

The paper tackles the problem of selecting the most relevant visual cues for visual-inertial navigation under computational constraints, resulting in a greedy algorithm that ensures state-of-the-art performance with lean processing time, outperforming appearance-based methods in easy scenarios and enabling accurate navigation in challenging ones.

We study a Visual-Inertial Navigation (VIN) problem in which a robot needs to estimate its state using an on-board camera and an inertial sensor, without any prior knowledge of the external environment. We consider the case in which the robot can allocate limited resources to VIN, due to tight computational constraints. Therefore, we answer the following question: under limited resources, what are the most relevant visual cues to maximize the performance of visual-inertial navigation? Our approach has four key ingredients. First, it is task-driven, in that the selection of the visual cues is guided by a metric quantifying the VIN performance. Second, it exploits the notion of anticipation, since it uses a simplified model for forward-simulation of robot dynamics, predicting the utility of a set of visual cues over a future time horizon. Third, it is efficient and easy to implement, since it leads to a greedy algorithm for the selection of the most relevant visual cues. Fourth, it provides formal performance guarantees: we leverage submodularity to prove that the greedy selection cannot be far from the optimal (combinatorial) selection. Simulations and real experiments on agile drones show that our approach ensures state-of-the-art VIN performance while maintaining a lean processing time. In the easy scenarios, our approach outperforms appearance-based feature selection in terms of localization errors. In the most challenging scenarios, it enables accurate visual-inertial navigation while appearance-based feature selection fails to track robot's motion during aggressive maneuvers.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes