Coarse-to-Fine Imitation Learning: Robot Manipulation from a Single Demonstration
This work addresses the challenge of efficient imitation learning for robot manipulation, though the contribution is incremental, since it builds on existing state estimation and trajectory modeling approaches.
The paper tackles the problem of enabling robots to learn manipulation tasks from a single human demonstration without prior object knowledge, achieving success on 8 everyday tasks with a stable and interpretable controller.
We introduce a simple new method for visual imitation learning, which allows a novel robot manipulation task to be learned from a single human demonstration, without requiring any prior knowledge of the object being interacted with. Our method models imitation learning as a state estimation problem, with the state defined as the end-effector's pose at the point where object interaction begins, as observed from the demonstration. By then modelling a manipulation task as a coarse, approach trajectory followed by a fine, interaction trajectory, this state estimator can be trained in a self-supervised manner, by automatically moving the end-effector's camera around the object. At test time, the end-effector moves to the estimated state through a linear path, at which point the original demonstration's end-effector velocities are simply replayed. This enables convenient acquisition of a complex interaction trajectory, without actually needing to explicitly learn a policy. Real-world experiments on 8 everyday tasks show that our method can learn a diverse range of skills from a single human demonstration, whilst also yielding a stable and interpretable controller.
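The test-time behaviour described above (a linear move to the estimated state, followed by replay of the demonstrated velocities) can be illustrated with a minimal sketch. It is not the authors' implementation: it handles positions only and ignores orientation, assumes the state estimate is already given, and assumes the recorded velocities are expressed in a frame aligned with the one used here. The function names, the `step` size, and the `dt` timestep are hypothetical choices made for illustration.

```python
import numpy as np

def linear_approach(start, target, step=0.01):
    """Coarse phase: waypoints on a straight line from start to target.

    Positions only; `step` is a hypothetical maximum spacing between waypoints.
    """
    start = np.asarray(start, dtype=float)
    target = np.asarray(target, dtype=float)
    distance = np.linalg.norm(target - start)
    n_steps = max(1, int(np.ceil(distance / step)))
    # Interpolation fractions in (0, 1]; the final waypoint is the target itself.
    fractions = np.linspace(0.0, 1.0, n_steps + 1)[1:]
    return start + fractions[:, None] * (target - start)

def replay_velocities(position, velocities, dt):
    """Fine phase: integrate recorded velocities forward from `position`.

    Assumes a fixed timestep `dt` and velocities in a frame aligned with
    `position` (a simplification made for this sketch).
    """
    position = np.asarray(position, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    return position + np.cumsum(velocities * dt, axis=0)

def coarse_to_fine(start, estimated_state, demo_velocities, dt, step=0.01):
    """Run the coarse approach, then replay the demonstration from its endpoint."""
    approach = linear_approach(start, estimated_state, step)
    interaction = replay_velocities(approach[-1], demo_velocities, dt)
    return approach, interaction

if __name__ == "__main__":
    # Made-up numbers, purely to exercise the functions.
    start = [0.0, 0.0, 0.3]
    estimated_state = [0.2, 0.1, 0.1]   # would come from the learned state estimator
    demo_velocities = [[0.0, 0.0, -0.05]] * 10  # recorded during the demonstration
    approach, interaction = coarse_to_fine(start, estimated_state, demo_velocities, dt=0.1)
    print("approach waypoints:", len(approach))
    print("final position:", interaction[-1])
```

In this sketch no policy is learned for the interaction phase: the only learned quantity would be `estimated_state`, which is consistent with the abstract's framing of imitation as state estimation.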