Deep Reinforcement Learning Based on Local GNN for Goal-conditioned Deformable Object Rearranging
This work targets generalization in robotic deformable object manipulation, aiming at applications beyond task-specific expert systems. The contribution is incremental in that it builds on existing GNN and attention mechanisms.
The paper addresses sim-to-real transfer for deformable object rearrangement. It proposes a local Graph Neural Network (GNN) method that encodes keypoints detected from images in two representation graphs, applies self-attention to update the graphs and cross-attention to generate manipulation actions, and reports effectiveness on 1-D and 2-D tasks in simulation, with transfer to a real robot achieved by fine-tuning a keypoint detector.
Object rearranging is one of the most common deformable manipulation tasks, in which the robot needs to rearrange a deformable object into a goal configuration. Previous studies focus on designing an expert system for each specific task using model-based or data-driven approaches, so their application scenarios are limited. Some research has attempted to design a general framework that provides more advanced manipulation capabilities for deformable rearranging tasks, and considerable progress has been achieved in simulation. However, transferring from simulation to reality is difficult due to the limitations of the end-to-end CNN architecture. To address these challenges, we design a local GNN (Graph Neural Network) based learning method, which utilizes two representation graphs to encode keypoints detected from images. Self-attention is applied for graph updating, and cross-attention is applied for generating manipulation actions. Extensive experiments demonstrate that our framework is effective in multiple 1-D (rope, rope ring) and 2-D (cloth) rearranging tasks in simulation and can be easily transferred to a real robot by fine-tuning a keypoint detector.
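To make the attention pipeline described above more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the two representation graphs are a current-state graph and a goal graph built over 2-D keypoints, that "local" means attention restricted to each keypoint's k nearest neighbours, and that an action is a pick position and a place position. The function names (`knn_mask`, `attention`, `propose_action`), the random untrained weights, and the pick-and-place readout are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; -inf entries receive exactly zero weight.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def knn_mask(points, k):
    # (N, N) boolean mask: True where column j is one of the k nearest
    # keypoints of row i (normally including i itself, at distance zero).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    nearest = np.argsort(dist, axis=1)[:, :k]
    mask = np.zeros(dist.shape, dtype=bool)
    np.put_along_axis(mask, nearest, True, axis=1)
    return mask

def attention(q_feat, kv_feat, w_q, w_k, w_v, mask=None):
    # Single-head scaled dot-product attention; returns output and weights.
    q, k, v = q_feat @ w_q, kv_feat @ w_k, kv_feat @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def propose_action(current, goal, k=3, dim=16, seed=0):
    rng = np.random.default_rng(seed)
    # Assumption: untrained random weights stand in for learned parameters.
    w_in = rng.normal(size=(2, dim))
    w_self = [rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3)]
    w_cross = [rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3)]

    h_cur, h_goal = current @ w_in, goal @ w_in
    # Self-attention restricted to local neighbourhoods updates each graph.
    h_cur = h_cur + attention(h_cur, h_cur, *w_self, mask=knn_mask(current, k))[0]
    h_goal = h_goal + attention(h_goal, h_goal, *w_self, mask=knn_mask(goal, k))[0]
    # Cross-attention: every current keypoint attends to all goal keypoints.
    _, cross_w = attention(h_cur, h_goal, *w_cross)

    # Hypothetical readout: a soft goal position per current keypoint, then
    # pick the keypoint farthest from its soft target and place it there.
    soft_target = cross_w @ goal
    pick_idx = int(np.argmax(np.linalg.norm(soft_target - current, axis=1)))
    return pick_idx, current[pick_idx], soft_target[pick_idx], cross_w

# Usage with made-up keypoints in normalised image coordinates.
rng = np.random.default_rng(1)
current = rng.uniform(size=(8, 2))
goal = rng.uniform(size=(8, 2))
pick_idx, pick_xy, place_xy, cross_w = propose_action(current, goal)
print(pick_idx, pick_xy, place_xy)
```

Because the network in this sketch operates only on keypoint coordinates, the image-dependent part is isolated in the keypoint detector, which is consistent with the abstract's claim that real-world transfer requires fine-tuning only that detector. A trained model would replace the random weights and would likely use a learned action head rather than the displacement heuristic used here.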