Learning a visuomotor controller for real world robotic grasping using simulated depth images
This work addresses the challenge of enabling robots to grasp reliably in household applications. The contribution is incremental, as it builds on existing simulation-based training methods.
The paper tackles robotic grasping in unstructured real-world environments by learning a closed-loop visuomotor controller from simulated depth images. The controller significantly outperforms a strong baseline in the presence of kinematic noise, perceptual errors, and object disturbances.
We want to build robots that are useful in unstructured real-world applications, such as doing work in the household. Grasping in particular is an important skill in this domain, yet it remains a challenge. One of the key hurdles is handling unexpected changes or motion in the objects being grasped, as well as kinematic noise and other errors in the robot. This paper proposes an approach to learning a closed-loop controller for robotic grasping that dynamically guides the gripper to the object. We use a wrist-mounted sensor to acquire depth images in front of the gripper and train a convolutional neural network to learn a distance function that, for grasp configurations over an image, predicts the distance to true grasps. The training sensor data is generated in simulation, a major advantage over previous work that uses real robot experience, which is costly to obtain. Despite being trained in simulation, our approach works well on real, noisy sensor images. We compare our controller in simulated and real robot experiments to a strong baseline for grasp pose detection, and find that our approach significantly outperforms the baseline in the presence of kinematic noise, perceptual errors, and disturbances of the object during grasping.
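To make the closed-loop idea concrete, the following is a minimal sketch of a controller that repeatedly scores candidate gripper motions with a distance-to-grasp function and executes the best one. It is not the paper's implementation. The learned convolutional network and the depth images are replaced by an analytic stand-in that reads a known target position, the gripper pose is reduced to a 2D point, and every name and parameter (`predicted_distance`, `control_step`, `run_controller`, the step size, the noise level, the disturbance) is hypothetical.

```python
import math
import random


def predicted_distance(gripper, offset, target):
    """Stand-in for the learned distance function (assumption).

    In the paper a CNN predicts this value from a depth image; here it is
    computed directly from a known target so the sketch stays self-contained.
    """
    x = gripper[0] + offset[0]
    y = gripper[1] + offset[1]
    return math.hypot(x - target[0], y - target[1])


def control_step(gripper, target, rng, n_candidates=64, max_step=0.05, noise=0.005):
    """One closed-loop step: sample candidate motions, score them, move."""
    candidates = [
        (rng.uniform(-max_step, max_step), rng.uniform(-max_step, max_step))
        for _ in range(n_candidates)
    ]
    candidates.append((0.0, 0.0))  # staying in place is always an option
    best = min(candidates, key=lambda c: predicted_distance(gripper, c, target))
    # Execute the chosen motion with simulated kinematic noise.
    return (
        gripper[0] + best[0] + rng.gauss(0.0, noise),
        gripper[1] + best[1] + rng.gauss(0.0, noise),
    )


def run_controller(start, target, steps=60, disturb_at=20, seed=0):
    """Run the loop; the object is pushed once to simulate a disturbance."""
    rng = random.Random(seed)
    gripper = start
    for t in range(steps):
        if t == disturb_at:
            target = (target[0] + 0.1, target[1] - 0.1)
        gripper = control_step(gripper, target, rng)
    return gripper, target


final_gripper, final_target = run_controller((0.0, 0.0), (0.3, 0.2))
final_error = math.hypot(
    final_gripper[0] - final_target[0], final_gripper[1] - final_target[1]
)
print(f"final distance to grasp: {final_error:.4f}")
```

Because the distance is re-evaluated at every step, the loop corrects for both the execution noise and the mid-run displacement of the object, which is the property an open-loop, plan-once approach lacks.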