AffordanceGrasp-R1: Leveraging Reasoning-Based Affordance Segmentation with Reinforcement Learning for Robotic Grasping
This work addresses robotic manipulation for applications that require complex language-conditioned tasks; it represents a hybrid, incremental improvement.
The paper tackles robotic grasping by introducing AffordanceGrasp-R1, a framework that combines reasoning-based affordance segmentation with reinforcement learning. The framework consistently outperforms state-of-the-art methods on benchmark datasets, and real-world grasping evaluations validate its robustness.
We introduce AffordanceGrasp-R1, a reasoning-driven affordance segmentation framework for robotic grasping that combines a chain-of-thought (CoT) cold-start strategy with reinforcement learning to enhance deduction and spatial grounding. In addition, we redesign the grasping pipeline to be more context-aware by generating grasp candidates from the global scene point cloud and subsequently filtering them using instruction-conditioned affordance masks. Extensive experiments demonstrate that AffordanceGrasp-R1 consistently outperforms state-of-the-art (SOTA) methods on benchmark datasets, and real-world robotic grasping evaluations further validate its robustness and generalization under complex language-conditioned manipulation scenarios.
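The candidate-filtering step of the redesigned pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera model, represents each grasp candidate only by its 3D center in the camera frame, and keeps a candidate when that center projects onto a pixel inside the affordance mask. The function name `filter_grasps_by_mask` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple


def filter_grasps_by_mask(
    centers: Sequence[Tuple[float, float, float]],
    mask: Sequence[Sequence[bool]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[int]:
    """Return indices of grasp candidates that fall inside the affordance mask.

    Assumptions (not taken from the paper):
      - centers are (x, y, z) grasp centers in the camera frame, z pointing forward;
      - mask is a non-empty H x W grid of booleans indexed as mask[row][col];
      - fx, fy, cx, cy are pinhole intrinsics in pixels.
    """
    height, width = len(mask), len(mask[0])
    kept = []
    for i, (x, y, z) in enumerate(centers):
        if z <= 0:
            # Points at or behind the camera plane cannot be projected.
            continue
        # Pinhole projection to the nearest pixel (u = column, v = row).
        u = round(fx * x / z + cx)
        v = round(fy * y / z + cy)
        # Keep the candidate only if the pixel is in the image and in the mask.
        if 0 <= u < width and 0 <= v < height and mask[v][u]:
            kept.append(i)
    return kept
```

Because candidates are generated from the whole scene before filtering, the mask only selects among grasps that already account for surrounding geometry. A real system would likely also use grasp orientation, gripper width, and a quality score, none of which are modeled in this sketch.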