Hierarchical Policies for Cluttered-Scene Grasping with Latent Plans
This work addresses 6D grasping in cluttered scenes, a longstanding problem in robotic manipulation, with an end-to-end learning approach aimed at complex scenes with obstacles, where most earlier end-to-end grasping methods have not yet scaled.
The paper tackles 6D grasping in cluttered scenes by proposing a hierarchical framework that learns collision-free, target-driven grasping from partial point clouds, using latent plans and reinforcement learning, and demonstrates generalization to real-world tasks.
6D grasping in cluttered scenes is a longstanding problem in robotic manipulation. Open-loop manipulation pipelines may fail due to inaccurate state estimation, while most end-to-end grasping methods have not yet scaled to complex scenes with obstacles. In this work, we propose a new method for end-to-end learning of 6D grasping in cluttered scenes. Our hierarchical framework learns collision-free target-driven grasping based on partial point cloud observations. We learn an embedding space to encode expert grasping plans during training and a variational autoencoder to sample diverse grasping trajectories at test time. Furthermore, we train a critic network for plan selection and an option classifier for switching to an instance grasping policy through hierarchical reinforcement learning. We evaluate our method against several baselines in simulation and demonstrate that our latent planning can generalize to real-world cluttered-scene grasping tasks. Our videos and code can be found at https://sites.google.com/view/latent-grasping .
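The test-time control flow described in the abstract (sample several latent plans, score them with a critic, follow the best one, and switch to the instance grasping policy when the option classifier fires) can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the 2-D feature vector standing in for a point cloud encoding, the standard normal prior, the distance-based critic, and the threshold-based switch are hypothetical placeholders for the learned networks.

```python
import random


def sample_latent_plans(num_plans, latent_dim, rng):
    # Stand-in for the VAE sampler (assumption): draw latent codes
    # from a standard normal prior.
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
            for _ in range(num_plans)]


def toy_critic(observation, latent):
    # Hypothetical critic: negative squared distance, so latents closer
    # to the observation feature score higher.
    return -sum((z - o) ** 2 for z, o in zip(latent, observation))


def toy_option_classifier(observation, threshold=0.5):
    # Hypothetical switch: True hands control to the instance grasping policy.
    return sum(o * o for o in observation) ** 0.5 < threshold


def select_plan(observation, latents, critic):
    # Score every sampled plan and return the index of the highest-scoring one.
    scores = [critic(observation, z) for z in latents]
    best = max(range(len(latents)), key=lambda i: scores[i])
    return best, scores


def hierarchical_step(observation, rng, num_plans=8, latent_dim=2):
    # High-level decision: either switch to grasping or pick a latent plan.
    if toy_option_classifier(observation):
        return "grasp", None
    latents = sample_latent_plans(num_plans, latent_dim, rng)
    best, _ = select_plan(observation, latents, toy_critic)
    return "follow_plan", latents[best]
```

Calling `hierarchical_step([1.0, 1.0], random.Random(0))` returns the `"follow_plan"` mode together with the selected latent code, while an observation near the origin such as `[0.1, 0.1]` triggers the toy switch and returns `("grasp", None)`. In the actual method each placeholder would be a trained network operating on partial point cloud observations.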