Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation
This work addresses data efficiency and robustness in robotic manipulation for real-world applications, though it appears incremental because it builds on existing diffusion and equivariant methods.
The paper tackles visual robotic manipulation by proposing Diffusion-EDFs, an SE(3)-equivariant diffusion-based method that can be trained end-to-end from only 5 to 10 human demonstrations in under an hour, and whose benchmark results show better generalizability and robustness than state-of-the-art methods.
Diffusion generative modeling has become a promising approach for learning robotic manipulation tasks from stochastic human demonstrations. In this paper, we present Diffusion-EDFs, a novel SE(3)-equivariant diffusion-based approach for visual robotic manipulation tasks. We show that our proposed method achieves remarkable data efficiency, requiring only 5 to 10 human demonstrations for effective end-to-end training in less than an hour. Furthermore, our benchmark experiments demonstrate that our approach has superior generalizability and robustness compared to state-of-the-art methods. Lastly, we validate our methods with real hardware experiments. Project Website: https://sites.google.com/view/diffusion-edfs/home
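To make the term "SE(3)-equivariant" concrete, here is a minimal numerical sketch. It is not the Diffusion-EDFs model: `toy_pose` is a hypothetical hand-written predictor that maps a point cloud to a pose, chosen only because it satisfies f(g·X) = g·f(X) for any rigid transform g. It illustrates equivariance under a transformation of the scene only, not the full bi-equivariance named in the title, and every function name below is an assumption made for illustration.

```python
import numpy as np

def random_se3(rng):
    """Sample a random rigid transform as a 4x4 homogeneous matrix."""
    # QR of a random matrix yields an orthogonal matrix; flip one column
    # if needed so the determinant is +1 (a proper rotation).
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    g = np.eye(4)
    g[:3, :3] = q
    g[:3, 3] = rng.normal(size=3)
    return g

def apply_se3(g, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    return points @ g[:3, :3].T + g[:3, 3]

def toy_pose(points):
    """Hypothetical predictor: position = centroid, orientation = a frame
    built by Gram-Schmidt from the first two points relative to the centroid."""
    c = points.mean(axis=0)
    e1 = points[0] - c
    e1 /= np.linalg.norm(e1)
    v = points[1] - c
    e2 = v - (v @ e1) * e1
    e2 /= np.linalg.norm(e2)
    e3 = np.cross(e1, e2)  # completes a right-handed frame
    pose = np.eye(4)
    pose[:3, :3] = np.stack([e1, e2, e3], axis=1)
    pose[:3, 3] = c
    return pose

def equivariance_error(seed=0):
    """Largest entrywise gap between f(g.X) and g.f(X) for a random scene."""
    rng = np.random.default_rng(seed)
    pts = rng.normal(size=(50, 3))
    g = random_se3(rng)
    lhs = toy_pose(apply_se3(g, pts))  # move the scene, then predict
    rhs = g @ toy_pose(pts)            # predict, then move the prediction
    return float(np.abs(lhs - rhs).max())
```

Calling `equivariance_error()` for different seeds should give a value at the level of floating-point round-off. A model with this property built in does not need to see the same scene in many orientations during training, which is one plausible reading of why so few demonstrations suffice.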