AIRO Sep 27, 2019

Zero-shot Imitation Learning from Demonstrations for Legged Robot Visual Navigation

arXiv:1909.12971v2

27 citations
AI Analysis

This work addresses the costly data collection required for legged robot navigation, offering robotics researchers and practitioners a more efficient alternative. It is incremental, however, in that it builds on existing imitation learning techniques.

The paper tackles the challenge of training visual navigation policies for legged robots by proposing a zero-shot imitation learning approach that uses human demonstrations from third-person perspectives, enabling effective navigation without requiring robot-specific expert data. The method achieves successful policy learning on the Laikago robot in both simulated and real-world environments, though no concrete performance numbers are provided.

Imitation learning is a popular approach for training visual navigation policies. However, collecting expert demonstrations for legged robots is challenging as these robots can be hard to control, move slowly, and cannot operate continuously for a long time. Here, we propose a zero-shot imitation learning approach for training a visual navigation policy on legged robots from human (third-person perspective) demonstrations, enabling high-quality navigation and cost-effective data collection. However, imitation learning from third-person demonstrations raises unique challenges. First, these demonstrations are captured from different camera perspectives, which we address via a feature disentanglement network (FDN) that extracts perspective-invariant state features. Second, as transition dynamics vary across systems, we label missing actions by either building an inverse model of the robot's dynamics in the feature space and applying it to the human demonstrations or developing a graphical user interface (GUI) to label human demonstrations. To train a navigation policy, we use a model-based imitation learning approach with FDN and labeled human demonstrations. We show that our framework can learn an effective policy for a legged robot, Laikago, from human demonstrations in both simulated and real-world environments. Our approach is zero-shot as the robot never navigates the same paths during training as those at testing time. We justify our framework by performing a comparative study.
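The action-labeling step in the abstract (fit an inverse model on robot transitions in feature space, then apply it to human demonstrations that have no actions) can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the inverse model's form, so the nearest-mean-delta rule below is a stand-in, the 2-D tuples stand in for FDN features, and every name (ToyInverseModel, label_demonstration, the action strings) is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Feature = Tuple[float, ...]
Transition = Tuple[Feature, str, Feature]  # (feature_t, action, feature_t+1)


def feature_delta(f_now: Feature, f_next: Feature) -> Feature:
    """Change in the feature vector between two consecutive frames."""
    return tuple(b - a for a, b in zip(f_now, f_next))


def squared_distance(u: Feature, v: Feature) -> float:
    return sum((a - b) ** 2 for a, b in zip(u, v))


class ToyInverseModel:
    """Hypothetical stand-in for a learned inverse dynamics model.

    It stores the mean feature change observed for each robot action and
    predicts the action whose mean change is closest to a queried change.
    """

    def __init__(self) -> None:
        self.mean_delta: Dict[str, Feature] = {}

    def fit(self, transitions: Sequence[Transition]) -> "ToyInverseModel":
        sums: Dict[str, List[float]] = {}
        counts: Dict[str, int] = {}
        for f_now, action, f_next in transitions:
            delta = feature_delta(f_now, f_next)
            if action not in sums:
                sums[action] = [0.0] * len(delta)
                counts[action] = 0
            for i, d in enumerate(delta):
                sums[action][i] += d
            counts[action] += 1
        self.mean_delta = {
            a: tuple(s / counts[a] for s in sums[a]) for a in sums
        }
        return self

    def predict(self, f_now: Feature, f_next: Feature) -> str:
        delta = feature_delta(f_now, f_next)
        return min(
            self.mean_delta,
            key=lambda a: squared_distance(delta, self.mean_delta[a]),
        )


def label_demonstration(model: ToyInverseModel,
                        features: Sequence[Feature]) -> List[str]:
    """Assign an action to every consecutive pair of demonstration frames."""
    return [
        model.predict(features[t], features[t + 1])
        for t in range(len(features) - 1)
    ]


# Robot transitions with known actions (made-up numbers for illustration).
robot_transitions: List[Transition] = [
    ((0.0, 0.0), "forward", (1.0, 0.0)),
    ((1.0, 0.0), "forward", (2.0, 0.0)),
    ((0.0, 0.0), "left", (0.0, 1.0)),
    ((0.0, 0.0), "right", (0.0, -1.0)),
]

# Human demonstration: features only, actions are missing.
human_features: List[Feature] = [
    (0.0, 0.0), (0.9, 0.1), (1.0, 1.0), (2.1, 1.0),
]

model = ToyInverseModel().fit(robot_transitions)
labels = label_demonstration(model, human_features)
print(labels)
```

The sketch only makes sense if robot and human features live in the same space, which is the role the abstract assigns to the perspective-invariant FDN features. The labeled pairs of feature and action would then feed the imitation learning stage.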

