Teaching robots to imitate a human with no on-teacher sensors. What are the key challenges?
This work aims to make robot imitation learning more accessible by eliminating the need for specialized tracking devices, though its approach appears incremental.
The paper tackles the problem of teaching robots object manipulation tasks from human demonstrations using only RGB or RGB-D cameras, with no on-teacher sensors. It focuses on challenges such as sensor selection and 6DoF pose estimation, and presents an architecture for transferring imitated tasks to simulated and real robot environments.
In this paper, we consider the problem of learning object manipulation tasks from human demonstration using RGB or RGB-D cameras. We highlight the key challenges in capturing sufficiently good data with no tracking devices, from sensor selection and accurate 6DoF pose estimation to natural language processing. In particular, we focus on two showcases: a gluing task with a glue gun and simple block-stacking with variable blocks. Furthermore, we discuss how a linguistic description of the task could help improve the accuracy of the task description. We also present the whole architecture for transferring the imitated task to simulated and real robot environments.
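To make the 6DoF pose estimation step more concrete, the following is a minimal sketch of how an object pose observed in the camera frame might be re-expressed in the robot base frame before a task is transferred to a simulated or real robot. It is not taken from the paper: the function names `pose_to_matrix` and `camera_to_robot`, the ZYX (yaw-pitch-roll) Euler convention, and the example camera placement are all assumptions chosen for illustration.

```python
import math


def pose_to_matrix(x, y, z, roll, pitch, yaw):
    """Build a 4x4 homogeneous transform from a 6DoF pose.

    Assumption: rotation is R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, x],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, y],
        [-sp, cp * sr, cp * cr, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def camera_to_robot(T_robot_camera, T_camera_object):
    """Express an object pose, estimated in the camera frame, in the robot base frame."""
    return matmul4(T_robot_camera, T_camera_object)
```

As a usage example under these assumptions, a camera placed at (0.5, 0, 1.0) m in the robot frame and rotated 90 degrees about the vertical axis, observing an object 0.2 m along its own x-axis, yields an object position of roughly (0.5, 0.2, 1.0) m in the robot frame. In practice the camera-to-robot transform would come from an extrinsic calibration, which this sketch does not cover.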