RO, LG | Jun 12, 2025

Demonstration Sidetracks: Categorizing Systematic Non-Optimality in Human Demonstrations

arXiv:2506.11262v1 | 3 citations | h-index: 11 | Has Code | RO-MAN
Originality: Incremental advance
AI Analysis

The paper addresses the problem of imperfect demonstrations in robot learning from demonstration. The advance is incremental because it categorizes existing issues rather than solving them.

The paper studies non-optimal behaviors in human demonstrations for robot learning, shows that they are systematic rather than random noise, and identifies four types of sidetracks that appear frequently across 40 participants in a long-horizon task.

Learning from Demonstration (LfD) is a popular approach for robots to acquire new skills, but most LfD methods suffer from imperfections in human demonstrations. Prior work typically treats these suboptimalities as random noise. In this paper we study non-optimal behaviors in non-expert demonstrations and show that they are systematic, forming what we call demonstration sidetracks. Using a public space study with 40 participants performing a long-horizon robot task, we recreated the setup in simulation and annotated all demonstrations. We identify four types of sidetracks (Exploration, Mistake, Alignment, Pause) and one control pattern (one-dimension control). Sidetracks appear frequently across participants, and their temporal and spatial distribution is tied to task context. We also find that users' control patterns depend on the control interface. These insights point to the need for better models of suboptimal demonstrations to improve LfD algorithms and bridge the gap between lab training and real-world deployment. All demonstrations, infrastructure, and annotations are available at https://github.com/AABL-Lab/Human-Demonstration-Sidetracks.
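As a rough illustration of how annotations over the four sidetrack types might be represented and summarized, here is a minimal Python sketch. The `Segment` schema, the field names, and the `summarize` helper are assumptions made for illustration only; they are not taken from the paper or from the format used in the linked repository.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Sidetrack(Enum):
    """The four sidetrack types named in the abstract."""
    EXPLORATION = "exploration"
    MISTAKE = "mistake"
    ALIGNMENT = "alignment"
    PAUSE = "pause"


@dataclass(frozen=True)
class Segment:
    """One annotated span of a demonstration (hypothetical schema)."""
    kind: Sidetrack
    start_s: float  # segment start, seconds from demonstration start
    end_s: float    # segment end, seconds from demonstration start

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def summarize(segments, demo_length_s):
    """Return {sidetrack type: (segment count, fraction of demo time)}."""
    counts = Counter(seg.kind for seg in segments)
    time_in = Counter()
    for seg in segments:
        time_in[seg.kind] += seg.duration_s
    return {
        kind: (counts[kind], time_in[kind] / demo_length_s)
        for kind in Sidetrack
    }


# Made-up example: one 40-second demonstration with three annotated sidetracks.
demo = [
    Segment(Sidetrack.EXPLORATION, 0.0, 4.0),
    Segment(Sidetrack.PAUSE, 10.0, 12.0),
    Segment(Sidetrack.EXPLORATION, 20.0, 22.0),
]
summary = summarize(demo, demo_length_s=40.0)
for kind, (count, fraction) in summary.items():
    print(f"{kind.value}: {count} segment(s), {fraction:.0%} of demo time")
```

Storing start and end times per segment keeps the temporal distribution recoverable, which matters because the abstract ties sidetrack timing to task context.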

Code Implementations: 1 repo