DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes
The work addresses the scarcity of training data for dexterous robotic grasping in cluttered environments, and its contribution is a substantial, specific gain rather than an incremental improvement.
The paper tackles the challenge of dexterous grasping in cluttered scenes by introducing a large-scale synthetic dataset and a two-stage generative method using a diffusion model, achieving a 90.7% real-world grasping success rate.
Grasping in cluttered scenes remains highly challenging for dexterous hands due to the scarcity of data. To address this problem, we present a large-scale synthetic benchmark, encompassing 1319 objects, 8270 scenes, and 427 million grasps. Beyond benchmarking, we also propose a novel two-stage grasping method that learns efficiently from data by using a diffusion model that conditions on local geometry. Our proposed generative method outperforms all baselines in simulation experiments. Furthermore, with the aid of test-time depth restoration, our method demonstrates zero-shot sim-to-real transfer, attaining a 90.7% real-world dexterous grasping success rate in cluttered scenes.
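To make "a diffusion model that conditions on local geometry" more concrete, the sketch below shows one common way such a model can be sampled: standard DDPM-style ancestral sampling, where a noise predictor sees the current noisy grasp vector, the timestep, and a local geometry feature. This is an illustration under assumptions, not the paper's implementation. The abstract does not specify the sampler, the noise schedule, or the grasp parameterization, so the names `make_schedule`, `toy_noise_predictor`, and `sample_grasp`, the linear schedule, and the 7-dimensional grasp vector are all hypothetical. The toy predictor is a stand-in for a trained network.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumed choice) with cumulative alpha products."""
    denom = max(num_steps - 1, 1)
    betas = [beta_start + (beta_end - beta_start) * i / denom for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, t, local_feature):
    """Hypothetical placeholder for a trained network eps(x_t, t, c).

    It only demonstrates the interface: the prediction depends on the
    noisy sample x_t and on the conditioning feature c (t is unused here).
    """
    n = len(local_feature)
    return [x - local_feature[i % n] for i, x in enumerate(x_t)]


def sample_grasp(local_feature, grasp_dim=7, num_steps=50, seed=0,
                 noise_predictor=toy_noise_predictor):
    """Draw one grasp vector by DDPM-style ancestral sampling, conditioned on local_feature."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    # Start from pure Gaussian noise.
    x = [rng.gauss(0.0, 1.0) for _ in range(grasp_dim)]
    for t in reversed(range(num_steps)):
        eps = noise_predictor(x, t, local_feature)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            # Add fresh noise at every step except the last one.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


if __name__ == "__main__":
    feature = [0.1, -0.2, 0.3]  # made-up local geometry feature
    print(sample_grasp(feature, grasp_dim=7, seed=0))
```

Because sampling is stochastic, calling `sample_grasp` with different seeds on the same feature yields different candidate grasps. That is the practical appeal of a generative formulation in clutter, where several distinct grasps may be valid for the same local geometry.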