RO · Aug 10, 2021

Learn to Grasp with Less Supervision: A Data-Efficient Maximum Likelihood Grasp Sampling Loss

Xinghao Zhu, Yefan Zhou, Yongxiang Fan, Lingfeng Sun, Jianyu Chen, Masayoshi Tomizuka

arXiv:2110.01379v28.915 citations · h-index: 93

Originality: Incremental advance

AI Analysis

The paper addresses data efficiency in robotic manipulation, though the advance is incremental because it builds on existing deep grasping models.

The paper tackles data sparsity in robotic grasping by proposing a Maximum Likelihood Grasp Sampling Loss (MLGSL), which enables training with only 2 labels per image. This is an 8x gain in data efficiency over prior methods that need 16 labels per image, while physical robot experiments reach a 90.7% grasp success rate on household objects.

Robotic grasping for a diverse set of objects is essential in many robot manipulation tasks. One promising approach is to learn deep grasping models from large training datasets of object images and grasp labels. However, empirical grasping datasets are typically sparsely labeled (i.e., each image has only a small number of successful grasp labels). This data sparsity can lead to insufficient supervision and false-negative labels, and thus to poor learning results. This paper proposes a Maximum Likelihood Grasp Sampling Loss (MLGSL) to tackle the data sparsity issue. The proposed method assumes that successful grasps are stochastically sampled from the predicted grasp distribution and maximizes the likelihood of the observed grasps. MLGSL is used to train a fully convolutional network that generates thousands of grasps simultaneously. Training results suggest that models based on MLGSL can learn to grasp from datasets containing only 2 labels per image. Compared to previous works, which require training datasets with 16 labels per image, MLGSL is 8x more data-efficient. Meanwhile, physical robot experiments demonstrate equivalent performance, with a 90.7% grasp success rate on household objects.
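The idea of treating labeled grasps as samples from a predicted distribution can be illustrated with a small sketch. This is not the paper's exact loss, which the abstract does not spell out. It assumes, purely for illustration, that the network outputs one score per pixel and that a softmax over all pixels defines the grasp distribution. The function name `grasp_sampling_nll` and the 4x4 score maps are hypothetical.

```python
import math

def grasp_sampling_nll(logits, labeled_pixels):
    """Mean negative log-likelihood of labeled grasps under a pixel-wise softmax.

    logits: 2D list of per-pixel grasp scores (hypothetical network output).
    labeled_pixels: list of (row, col) positions of known successful grasps.
    Assumption: the softmax over all pixels is the predicted grasp distribution.
    """
    flat = [v for row in logits for v in row]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(flat)
    log_z = m + math.log(sum(math.exp(v - m) for v in flat))
    # Only labeled pixels contribute likelihood terms. Unlabeled pixels get
    # no explicit "failure" target; they enter only through the normalizer.
    log_probs = [logits[r][c] - log_z for r, c in labeled_pixels]
    return -sum(log_probs) / len(log_probs)

# Sparse supervision: 2 successful grasp labels on a 4x4 score map.
labels = [(0, 1), (2, 3)]

uniform = [[0.0] * 4 for _ in range(4)]
loss_uniform = grasp_sampling_nll(uniform, labels)  # equals log(16) for a uniform map

# Raising the scores at the labeled pixels increases their likelihood.
peaked = [row[:] for row in uniform]
for r, c in labels:
    peaked[r][c] = 5.0
loss_peaked = grasp_sampling_nll(peaked, labels)
```

In this sketch the loss drops as probability mass moves toward the labeled grasps, and no pixel is ever assigned a hard negative label. That is one way to avoid treating unlabeled but feasible grasps as failures, which is the false-negative problem the abstract describes.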
