RO · Jan 24, 2018

UAV Visual Teach and Repeat Using Only Semantic Object Features

arXiv:1801.07899v1 · 9 citations
Originality: Incremental advance
AI Analysis

This work addresses robust navigation for UAVs in dynamic environments, but the advance is incremental: it adapts existing Visual Teach and Repeat (VTR) methods to use semantic features.

The paper tackles VTR for UAV navigation by using semantic object detections as landmarks, and demonstrates that this approach can handle lighting changes and object movements without relying on low-level image features.

We demonstrate the use of semantic object detections as robust features for Visual Teach and Repeat (VTR). Recent CNN-based object detectors are able to reliably detect objects of tens or hundreds of categories in a video at frame rates. We show that such detections are repeatable enough to use as landmarks for VTR, without any low-level image features. Since object detections are highly invariant to lighting and surface appearance changes, our VTR can cope with global lighting changes and local movements of the landmark objects. In the teaching phase, we build a series of compact scene descriptors: a list of detected object labels and their image-plane locations. In the repeating phase, we use Seq-SLAM-like relocalization to identify the most similar learned scene, then use a motion control algorithm based on the funnel lane theory to navigate the robot along the previously piloted trajectory. We evaluate the method on a commodity UAV, examining the robustness of the algorithm to new viewpoints, lighting conditions, and movements of landmark objects. The results suggest that semantic object features could be useful due to their invariance to superficial appearance changes compared to low-level image features.
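The teach-phase descriptor and the sequence-based relocalization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` type, the greedy same-label matching in `scene_similarity`, the `max_dist` threshold, and the fixed-window scoring in `relocalize` are all assumptions chosen for illustration, and the paper's actual similarity measure and SeqSLAM-style search may differ. Image-plane locations are assumed to be normalized to [0, 1].

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Detection:
    """One detected object: class label plus normalized image-plane centre (assumed layout)."""
    label: str
    x: float
    y: float


Scene = List[Detection]  # compact scene descriptor: the detections seen in one frame


def scene_similarity(a: Scene, b: Scene, max_dist: float = 0.25) -> float:
    """Score two scenes in [0, 1] by greedily pairing same-label detections.

    Each detection in `a` is paired with the nearest unused detection in `b`
    that has the same label and lies closer than `max_dist` (hypothetical
    threshold). Closer pairs contribute more; unmatched detections lower the score.
    """
    if not a or not b:
        return 0.0
    used = set()
    score = 0.0
    for det in a:
        best_j, best_d = None, max_dist
        for j, other in enumerate(b):
            if j in used or other.label != det.label:
                continue
            d = ((det.x - other.x) ** 2 + (det.y - other.y) ** 2) ** 0.5
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            score += 1.0 - best_d / max_dist
    return score / max(len(a), len(b))


def relocalize(taught: Sequence[Scene], recent: Sequence[Scene]) -> int:
    """Return the index of the taught scene best aligned with the newest observation.

    A window of recent scenes is slid along the taught sequence and the
    per-frame similarities are summed, so a single ambiguous frame cannot
    decide the match on its own. Returns -1 if the taught sequence is
    shorter than the window.
    """
    n = len(recent)
    best_idx, best_score = -1, float("-inf")
    for start in range(len(taught) - n + 1):
        total = sum(scene_similarity(taught[start + k], recent[k]) for k in range(n))
        if total > best_score:
            best_idx, best_score = start + n - 1, total
    return best_idx
```

Because the descriptor holds only labels and positions, a global lighting change leaves it unchanged as long as the detector still fires, which is the invariance the abstract relies on. The funnel-lane controller that consumes the matched scene is not sketched here.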
