Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning
This work addresses the challenge of robotic manipulation in structured clutter, with applications such as domestic assistance, though its contribution, combining scene generation with exploration, is incremental.
The paper tackles the problem of teaching a robotic agent to explore cluttered scenes such as kitchens. It first generates stable scenes that contain hidden objects, then discovers those objects through interactive rearrangement. The learned agents hide and discover significantly more objects than the baselines, and the learned policy transfers from simulation to a real robot.
We introduce a novel method to teach a robotic agent to interactively explore cluttered yet structured scenes, such as kitchen pantries and grocery shelves, by leveraging the physical plausibility of the scene. We propose a learning framework to train an effective scene exploration policy that discovers hidden objects with minimal interactions. First, we define a scene grammar to represent structured clutter. Then we train a Graph Neural Network (GNN) based Scene Generation agent, using deep reinforcement learning (deep RL), to manipulate this scene grammar to create a diverse set of stable scenes, each containing multiple hidden objects. Given such cluttered scenes, we then train a Scene Exploration agent, also using deep RL, to uncover hidden objects by interactively rearranging the scene. We show that our learned agents hide and discover significantly more objects than the baselines. We present quantitative results that demonstrate the generalization capabilities of our agents. We also demonstrate sim-to-real transfer by successfully deploying the learned policy on a real UR10 robot to explore real-world cluttered scenes. The supplemental video can be found at https://www.youtube.com/watch?v=T2Jo7wwaXss.
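To make the exploration task concrete, the following is a minimal toy sketch. It is not the scene grammar, GNN, or learned policy from the paper. It assumes a scene can be approximated as a small graph with two hypothetical relations, `stacked` (one object resting on another) and `occludes` (one object hiding another from view), and it substitutes a hand-written greedy heuristic for the deep RL Scene Exploration agent. Physical stability is not modelled. The names `Scene` and `greedy_explore` are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """Toy scene graph (an assumption for illustration, not the paper's grammar)."""
    objects: set
    stacked: set = field(default_factory=set)   # (upper, lower) pairs
    occludes: set = field(default_factory=set)  # (front, back) pairs

    def hidden(self):
        """Objects that still have at least one occluder in front of them."""
        return {back for _front, back in self.occludes}

    def movable(self):
        """Visible objects with nothing stacked on top of them."""
        loaded = {lower for _upper, lower in self.stacked}
        return self.objects - loaded - self.hidden()

    def remove(self, name):
        """Return a new scene with `name` taken out, dropping its relations."""
        return Scene(
            objects=self.objects - {name},
            stacked={(u, l) for u, l in self.stacked if name not in (u, l)},
            occludes={(f, b) for f, b in self.occludes if name not in (f, b)},
        )


def greedy_explore(scene):
    """Greedy stand-in for a learned policy: at each step remove the movable
    object that reveals the most hidden objects. Returns (actions, discovered)."""
    actions, discovered = [], []
    while scene.hidden():
        candidates = sorted(scene.movable())  # sorted for deterministic ties
        if not candidates:
            break  # nothing can be moved (e.g. cyclic relations)
        before = scene.hidden()
        best = max(candidates, key=lambda n: len(before - scene.remove(n).hidden()))
        after = scene.remove(best)
        discovered.extend(sorted(before - after.hidden()))
        actions.append(best)
        scene = after
    return actions, discovered


# Hypothetical pantry shelf: a box sits on the cereal, the cereal hides a jar,
# and the jar in turn hides a can.
shelf = Scene(
    objects={"box", "cereal", "jar", "can"},
    stacked={("box", "cereal")},
    occludes={("cereal", "jar"), ("jar", "can")},
)
actions, discovered = greedy_explore(shelf)
print(actions, discovered)
```

In this toy setting the number of entries in `actions` plays the role of the interaction count that an exploration policy would try to minimize, and `discovered` lists the objects uncovered along the way. A learned policy would replace the greedy choice of `best` with an action selected from observations.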