On Robot Grasp Learning Using Equivariant Models
This paper addresses the sample inefficiency of robot grasp learning, proposing a method that reduces the amount of training data required.
The paper tackles real-world grasp detection by using the SE(2)-equivariance of the planar grasp function to constrain the neural network, which enables end-to-end training from scratch on a physical robot with as few as 600 grasp attempts and in under 1.5 hours of robot time.
Real-world grasp detection is challenging due to the stochasticity of grasp dynamics and hardware noise. Ideally, the system would adapt to the real world by training directly on physical systems. However, this is generally difficult because most grasp learning models require a large amount of training data. In this paper, we note that the planar grasp function is $\SE(2)$-equivariant and demonstrate that this structure can be used to constrain the neural network used during learning. This creates an inductive bias that can significantly improve the sample efficiency of grasp learning and enable end-to-end training from scratch on a physical robot with as few as $600$ grasp attempts. We call this method Symmetric Grasp learning (SymGrasp) and show that it can learn to grasp ``from scratch'' in less than 1.5 hours of physical robot time.
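To make the equivariance constraint concrete, the following is a minimal sketch, not the SymGrasp network. It assumes a toy "grasp quality map" computed by a single zero-padded cross-correlation, and it checks equivariance only under the four 90-degree rotations (a discrete subgroup standing in for the rotational part of $\SE(2)$; translation equivariance comes from the convolution itself). The names `quality_map`, `symmetrize_c4`, and `equivariance_error` are hypothetical and introduced only for illustration.

```python
import numpy as np


def quality_map(image, kernel):
    """Toy per-pixel score: 'same'-size cross-correlation with zero padding.

    Illustrative stand-in for a grasp function, not the paper's model.
    """
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    # windows[i, j] is the (kh, kw) patch centred on pixel (i, j).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    return np.einsum("ijab,ab->ij", windows, kernel)


def symmetrize_c4(kernel):
    """Average a square kernel over its four 90-degree rotations."""
    return sum(np.rot90(kernel, k) for k in range(4)) / 4.0


def equivariance_error(image, kernel):
    """Max absolute gap between f(rotate(x)) and rotate(f(x))."""
    lhs = quality_map(np.rot90(image), kernel)
    rhs = np.rot90(quality_map(image, kernel))
    return float(np.max(np.abs(lhs - rhs)))


rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))        # assumed toy "depth image"
raw_kernel = rng.normal(size=(3, 3))   # unconstrained weights
sym_kernel = symmetrize_c4(raw_kernel) # weights constrained by symmetry

print("unconstrained kernel error:", equivariance_error(image, raw_kernel))
print("constrained kernel error:  ", equivariance_error(image, sym_kernel))
```

With the constrained kernel, rotating the input rotates the output map by the same amount, so the error is at floating-point round-off level, whereas the unconstrained kernel generally violates the property. This is the sense in which symmetry acts as an inductive bias: the constrained model does not have to learn each rotated case from separate data.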