Robust Offline Imitation Learning Through State-level Trajectory Stitching
This work addresses the challenges of covariate shift and data scarcity in imitation learning for robotics, and represents an incremental improvement over existing offline methods.
The paper addresses imitation learning's reliance on high-quality expert data with a state-based search framework that stitches trajectory fragments from mixed-quality datasets, improving generalization and performance on standard benchmarks and real-world robotic tasks.
Imitation learning (IL) has proven effective for enabling robots to acquire visuomotor skills from expert demonstrations. However, traditional IL methods rely on high-quality expert data, which is often scarce, and they suffer from covariate shift. To address these challenges, recent advances in offline IL have incorporated suboptimal, unlabeled datasets into training. In this paper, we propose a novel approach to enhance policy learning from mixed-quality offline datasets by leveraging task-relevant trajectory fragments and rich environmental dynamics. Specifically, we introduce a state-based search framework that stitches state-action pairs from imperfect demonstrations, generating more diverse and informative training trajectories. Experimental results on standard IL benchmarks and real-world robotic tasks show that our method significantly improves both generalization and performance.
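As a rough illustration of what state-level stitching can look like, the Python sketch below greedily chains transitions from a pooled dataset: at each step it looks for stored transitions whose start state is near the current state and follows the one whose next state is closest to a goal state. This is not the paper's algorithm. The function name `stitch_trajectory`, the Euclidean distance, the `radius` threshold, and the goal-directed selection rule are all assumptions made for illustration.

```python
import numpy as np


def stitch_trajectory(start, goal, states, actions, next_states,
                      radius=0.1, max_steps=50, goal_tol=1e-6):
    """Greedily stitch (state, action) pairs from pooled transitions.

    Hypothetical helper, not the paper's method: among transitions whose
    start state lies within `radius` of the current state, follow the one
    whose next state is closest to `goal`.
    """
    current = np.asarray(start, dtype=float)
    goal = np.asarray(goal, dtype=float)
    used = np.zeros(len(states), dtype=bool)  # each transition is used once
    stitched = []
    for _ in range(max_steps):
        # Transitions that start close enough to the current state.
        near = np.linalg.norm(states - current, axis=1) <= radius
        candidates = np.flatnonzero(near & ~used)
        if candidates.size == 0:
            break
        dist_to_goal = np.linalg.norm(next_states[candidates] - goal, axis=1)
        # Stop if no candidate moves closer to the goal.
        if dist_to_goal.min() >= np.linalg.norm(current - goal):
            break
        best = candidates[np.argmin(dist_to_goal)]
        stitched.append((states[best].copy(), actions[best].copy()))
        used[best] = True
        current = next_states[best]
        if np.linalg.norm(current - goal) <= goal_tol:
            break
    return stitched, current


# Toy 1-D pool (made-up data). Fragment A: 0 -> 1 -> 2 -> 0 (drifts back).
# Fragment B: 2 -> 3 -> 4. The two fragments share state 2.
states = np.array([[0.0], [1.0], [2.0], [2.0], [3.0]])
actions = np.array([[1.0], [1.0], [-2.0], [1.0], [1.0]])
next_states = np.array([[1.0], [2.0], [0.0], [3.0], [4.0]])

stitched, final_state = stitch_trajectory([0.0], [4.0],
                                          states, actions, next_states)
print(len(stitched), final_state)
```

In this toy pool, fragment A drifts back to state 0 after reaching state 2, and the search switches to fragment B at the shared state 2, so every stitched action is the forward move. A practical system would likely search in a learned state representation rather than raw coordinates; that choice is outside what this sketch covers.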