LG · RO · Mar 10, 2021

Learning from Imperfect Demonstrations from Agents with Varying Dynamics

arXiv:2103.05910v1 · 38 citations
AI Analysis

This paper addresses the problem of learning from imperfect demonstrations in robotics, but its contribution is incremental because it builds on prior imitation learning methods.

The paper tackles imitation learning with sub-optimal or varying-dynamics demonstrations by developing a metric to assess demonstration usefulness, resulting in improved policies with higher expected returns in simulated and real robot experiments.

Imitation learning enables robots to learn from demonstrations. Previous imitation learning algorithms usually assume access to optimal expert demonstrations. However, in many real-world applications, this assumption is limiting. Most collected demonstrations are not optimal or are produced by an agent with slightly different dynamics. We therefore address the problem of imitation learning when the demonstrations can be sub-optimal or be drawn from agents with varying dynamics. We develop a metric composed of a feasibility score and an optimality score to measure how useful a demonstration is for imitation learning. The proposed score enables learning from more informative demonstrations, and disregarding the less relevant demonstrations. Our experiments on four environments in simulation and on a real robot show improved learned policies with higher expected return.
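To illustrate how such a metric could steer learning toward the more informative demonstrations, here is a minimal sketch. It is not the paper's implementation. The abstract does not say how the two scores are computed or combined, so the sketch assumes both are already available as numbers in [0, 1] and combines them by a simple product. The names `Demonstration`, `usefulness`, `demonstration_weights`, and `min_score` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demonstration:
    name: str
    feasibility: float  # assumed precomputed, in [0, 1]
    optimality: float   # assumed precomputed, in [0, 1]


def usefulness(demo: Demonstration) -> float:
    """Combine the two scores into one number.

    Assumption: a plain product, so a demonstration scores high only if it
    is both feasible and close to optimal. The source does not specify
    the combination rule.
    """
    return demo.feasibility * demo.optimality


def demonstration_weights(demos: List[Demonstration],
                          min_score: float = 0.0) -> List[float]:
    """Return normalized per-demonstration weights that sum to 1.

    Demonstrations scoring below `min_score` (a hypothetical threshold)
    get weight 0, which corresponds to disregarding them.
    """
    scores = [usefulness(d) for d in demos]
    kept = [s if s >= min_score else 0.0 for s in scores]
    total = sum(kept)
    if total == 0.0:
        raise ValueError("no demonstration passed the usefulness threshold")
    return [s / total for s in kept]


demos = [
    Demonstration("near_expert", feasibility=1.0, optimality=0.5),
    Demonstration("mediocre", feasibility=0.5, optimality=0.5),
    Demonstration("infeasible", feasibility=0.0, optimality=1.0),
]
weights = demonstration_weights(demos)
strict_weights = demonstration_weights(demos, min_score=0.3)
```

Weights like these could then scale each demonstration's term in an imitation loss. With the product rule, a demonstration that is infeasible for the learner contributes nothing, however optimal it was for the agent that produced it.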

Code Implementations: 1 repo
