RO, AI | Jul 29, 2025

Model Predictive Adversarial Imitation Learning for Planning from Observation

Tyler Han, Yanda Bao, Bhaumik Mehta, Gabriel Guo, Anubhav Vishwakarma, Emily Kang, Sanghun Jung, Rosario Scalise, Jason Zhou, Bryan Xu, Byron Boots

arXiv:2507.21533v1 | 2 citations | h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reliable planning from observation-only demonstrations for robotics and control applications, representing an incremental advancement by integrating existing methods.

The paper tackles the problem of learning to plan from ambiguous and incomplete human demonstrations by unifying inverse reinforcement learning with model predictive control, resulting in significant improvements in sample efficiency, out-of-distribution generalization, and robustness in simulated and real-world experiments.

Human demonstration data is often ambiguous and incomplete, motivating imitation learning approaches that also exhibit reliable planning behavior. A common paradigm for planning from demonstration involves learning a reward function via Inverse Reinforcement Learning (IRL), then deploying this reward via Model Predictive Control (MPC). Towards unifying these methods, we derive a replacement of the policy in IRL with a planning-based agent. With connections to Adversarial Imitation Learning, this formulation enables end-to-end interactive learning of planners from observation-only demonstrations. In addition to benefits in interpretability, complexity, and safety, we study and observe significant improvements in sample efficiency, out-of-distribution generalization, and robustness. The study includes evaluations in both simulated control benchmarks and real-world navigation experiments using few-to-single observation-only demonstrations.
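To make the idea of replacing the policy with a planner more concrete, here is a minimal toy sketch. It is not the authors' algorithm or code. It assumes a 1-D state with known dynamics `s' = s + a`, a linear discriminator over state transitions `(s, s')` (so no expert actions are needed), a random-shooting planner standing in for MPC, and a plain logistic-regression update for the discriminator. All function names and parameters (`features`, `plan`, `discriminator_step`, `train`, `horizon`, `n_samples`, `lr`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def features(s, s_next):
    """Transition features [s, s', 1]; observation-only, so no action enters."""
    s = np.asarray(s, dtype=float)
    s_next = np.asarray(s_next, dtype=float)
    return np.stack([s, s_next, np.ones_like(s)], axis=-1)

def reward(s, s_next, w):
    """Assumed reward: the discriminator logit of a transition (higher = more expert-like)."""
    return features(s, s_next) @ w

def plan(state, w, rng, horizon=5, n_samples=256):
    """Random-shooting planner: sample action sequences, score them, return the best first action."""
    actions = rng.uniform(-1.0, 1.0, size=(n_samples, horizon))
    states = np.full(n_samples, float(state))
    returns = np.zeros(n_samples)
    for t in range(horizon):
        next_states = states + actions[:, t]  # assumed known toy dynamics
        returns += reward(states, next_states, w)
        states = next_states
    return float(actions[np.argmax(returns), 0])

def discriminator_step(w, expert, agent, lr=0.1):
    """One gradient step on the logistic loss (expert transitions labeled 1, agent transitions 0)."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    phi_e = features(expert[:, 0], expert[:, 1])
    phi_a = features(agent[:, 0], agent[:, 1])
    grad_expert = -((1.0 - sigmoid(phi_e @ w))[:, None] * phi_e).mean(axis=0)
    grad_agent = (sigmoid(phi_a @ w)[:, None] * phi_a).mean(axis=0)
    return w - lr * (grad_expert + grad_agent)

def train(expert, n_iters=20, episode_len=5, seed=0):
    """Alternate between acting with the planner and updating the discriminator."""
    rng = np.random.default_rng(seed)
    w = np.zeros(3)
    for _ in range(n_iters):
        s, transitions = 0.0, []
        for _ in range(episode_len):
            a = plan(s, w, rng)             # the planner plays the role of the policy
            transitions.append((s, s + a))  # store only observed states
            s = s + a
        w = discriminator_step(w, expert, np.array(transitions))
    return w
```

For example, `train(np.array([[0.0, 0.5], [0.5, 1.0]]))` would run the loop on a single two-step demonstration given as rows of `(s, s')`. A practical system would likely use a learned dynamics model, a neural discriminator, and a stronger sampling-based optimizer; those pieces are outside what the abstract specifies.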
