LG · May 30, 2022

TaSIL: Taylor Series Imitation Learning

arXiv:2205.14812v2 · 23 citations · h-index: 30
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of robust policy imitation in robotics and control systems, offering an incremental improvement over existing methods.

The paper tackles imitation learning in continuous control by proposing TaSIL, which augments behavior cloning losses with penalties on deviations in the higher-order Taylor series terms between the learned and expert policies. The authors report significant improvement over Behavior Cloning, DART, and DAgger baselines across a variety of MuJoCo tasks.

We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control. TaSIL penalizes deviations in the higher-order Taylor series terms between the learned and expert policies. We show that experts satisfying a notion of $\textit{incremental input-to-state stability}$ are easy to learn, in the sense that a small TaSIL-augmented imitation loss over expert trajectories guarantees a small imitation loss over trajectories generated by the learned policy. We provide sample-complexity bounds for TaSIL that scale as $\tilde{\mathcal{O}}(1/n)$ in the realizable setting, for $n$ the number of expert demonstrations. Finally, we demonstrate experimentally the relationship between the robustness of the expert policy and the order of Taylor expansion required in TaSIL, and compare standard Behavior Cloning, DART, and DAgger with TaSIL-loss-augmented variants. In all cases, we show significant improvement over baselines across a variety of MuJoCo tasks.
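To make the idea of penalizing Taylor series deviations concrete, the following is a minimal sketch of a first-order TaSIL-style loss. It is not the authors' implementation: the function names (`jacobian_fd`, `tasil_loss_order1`), the use of central finite differences for the Jacobian, the Frobenius norm, the averaging over states, and the `weight` parameter are all assumptions made for illustration. It also assumes the expert policy can be queried at perturbed states.

```python
import math


def jacobian_fd(policy, x, eps=1e-5):
    """Central finite-difference Jacobian of `policy` at state `x`.

    `policy` maps a list of floats (state) to a list of floats (action).
    Returns a list of rows, one per action dimension.
    """
    m, n = len(policy(x)), len(x)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        a_plus, a_minus = policy(x_plus), policy(x_minus)
        for i in range(m):
            jac[i][j] = (a_plus[i] - a_minus[i]) / (2 * eps)
    return jac


def tasil_loss_order1(learned, expert, states, weight=1.0):
    """Hypothetical first-order TaSIL-style loss (illustrative only).

    Zeroth-order term: Euclidean norm of the action mismatch
    (the usual behavior cloning penalty).
    First-order term: Frobenius norm of the Jacobian mismatch.
    The result is averaged over the given expert states.
    """
    total = 0.0
    for x in states:
        a_l, a_e = learned(x), expert(x)
        zeroth = math.sqrt(sum((p - q) ** 2 for p, q in zip(a_l, a_e)))
        j_l, j_e = jacobian_fd(learned, x), jacobian_fd(expert, x)
        first = math.sqrt(
            sum(
                (p - q) ** 2
                for row_l, row_e in zip(j_l, j_e)
                for p, q in zip(row_l, row_e)
            )
        )
        total += zeroth + weight * first
    return total / len(states)


# Toy policies on a one-dimensional state (assumed for illustration).
def expert(x):
    return [-2.0 * x[0]]


def learned(x):
    return [-1.0 * x[0]]
```

In this toy setup, both policies output the same action at the state `[0.0]`, so a plain behavior cloning loss evaluated there would be zero. The first-order term still flags the mismatch in slope, which is the kind of deviation the augmented loss is meant to expose. Higher-order terms would follow the same pattern with higher derivatives, for which automatic differentiation would likely be a better fit than finite differences.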
