Dexterous Imitation Made Easy: A Learning-Based Framework for Efficient Dexterous Manipulation
This addresses the problem of poor sample efficiency in dexterous manipulation for robotics, though it is incremental as it builds on existing imitation learning methods.
The paper tackles the challenge of learning dexterous manipulation by proposing DIME, a framework that uses a single RGB camera for human demonstration and standard imitation learning, achieving success in complex tasks like flipping, spinning, and rotating objects with the Allegro hand in simulation and real-world benchmarks.
Optimizing behaviors for dexterous manipulation has been a longstanding challenge in robotics, with a variety of methods from model-based control to model-free reinforcement learning having been previously explored in literature. Perhaps one of the most powerful techniques to learn complex manipulation strategies is imitation learning. However, collecting and learning from demonstrations in dexterous manipulation is quite challenging. The complex, high-dimensional action-space involved with multi-finger control often leads to poor sample efficiency of learning-based methods. In this work, we propose 'Dexterous Imitation Made Easy' (DIME) a new imitation learning framework for dexterous manipulation. DIME only requires a single RGB camera to observe a human operator and teleoperate our robotic hand. Once demonstrations are collected, DIME employs standard imitation learning methods to train dexterous manipulation policies. On both simulation and real robot benchmarks we demonstrate that DIME can be used to solve complex, in-hand manipulation tasks such as 'flipping', 'spinning', and 'rotating' objects with the Allegro hand. Our framework along with pre-collected demonstrations is publicly available at https://nyu-robot-learning.github.io/dime.