Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation
This work targets the impractical amount of trial-and-error that robot learning for manipulation tasks requires, offering a more efficient alternative to deep reinforcement learning and imitation learning.
The paper tackles the challenge of training robots for complex manipulation tasks from visual observations. It introduces CEILing, an interactive learning framework that combines corrective and evaluative human feedback to train a stochastic policy, and reports that the framework solves such tasks in less than one hour of real-world training.
Learning to solve complex manipulation tasks from visual observations is a dominant challenge for real-world robot learning. Although deep reinforcement learning algorithms have recently demonstrated impressive results in this context, they still require an impractical amount of time-consuming trial-and-error iterations. In this work, we consider the promising alternative paradigm of interactive learning in which a human teacher provides feedback to the policy during execution, as opposed to imitation learning where a pre-collected dataset of perfect demonstrations is used. Our proposed CEILing (Corrective and Evaluative Interactive Learning) framework combines both corrective and evaluative feedback from the teacher to train a stochastic policy in an asynchronous manner, and employs a dedicated mechanism to trade off human corrections with the robot's own experience. We present results obtained with our framework in extensive simulation and real-world experiments to demonstrate that CEILing can effectively solve complex robot manipulation tasks directly from raw images in less than one hour of real-world training.
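To illustrate how corrective and evaluative feedback might be combined in a single training signal, here is a minimal sketch. It is not the CEILing implementation: the `Step` record, the `label_step` and `weighted_loss` helpers, the scalar observation standing in for image features, the linear-Gaussian toy policy, and the `correction_weight` value are all assumptions made for illustration. The idea shown is that corrected steps store the teacher's action with a larger weight, approved robot actions are kept with weight 1, and disapproved robot actions are given weight 0 so they do not contribute to the loss.

```python
import math
from dataclasses import dataclass


@dataclass
class Step:
    obs: float     # hypothetical 1-D stand-in for image features
    action: float  # action stored for training (the teacher's, if corrected)
    weight: float  # loss weight derived from the teacher's feedback


def label_step(obs, robot_action, evaluation_good,
               correction=None, correction_weight=2.0):
    """Turn one step of teacher feedback into a weighted training sample.

    correction_weight is an assumed knob for trading off human corrections
    against the robot's own experience; its value here is arbitrary.
    """
    if correction is not None:
        # Corrective feedback: store the teacher's action, not the robot's.
        return Step(obs, correction, correction_weight)
    # Evaluative feedback: keep the robot's action, weight 1 if approved, else 0.
    return Step(obs, robot_action, 1.0 if evaluation_good else 0.0)


def gaussian_nll(action, mean, std):
    """Negative log-likelihood of `action` under N(mean, std^2)."""
    return 0.5 * math.log(2 * math.pi * std ** 2) + (action - mean) ** 2 / (2 * std ** 2)


def weighted_loss(buffer, gain, std=1.0):
    """Feedback-weighted NLL of a toy stochastic policy a ~ N(gain * obs, std^2)."""
    total_weight = sum(step.weight for step in buffer)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        step.weight * gaussian_nll(step.action, gain * step.obs, std)
        for step in buffer
    )
    return weighted_sum / total_weight


# Three steps: an approved robot action, a disapproved one, and a corrected one.
buffer = [
    label_step(1.0, 0.5, evaluation_good=True),
    label_step(1.0, 3.0, evaluation_good=False),
    label_step(2.0, 0.0, evaluation_good=True, correction=2.0),
]
print(weighted_loss(buffer, gain=1.0))
```

In an asynchronous setup, samples like these would be appended to the buffer while the robot acts, and a separate training loop would minimize the weighted loss over the policy parameters (here just `gain`).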