ReSeFlow: Rectifying SE(3)-Equivariant Policy Learning Flows
This work addresses inference efficiency for robotic manipulation policies. It is an incremental improvement that combines two existing techniques, SE(3)-equivariance and rectified flows, for a specific domain.
The paper tackles slow inference in SE(3)-equivariant diffusion models for robotic manipulation by introducing ReSeFlow, which rectifies these models to enable fast policy generation. The results show that with only one inference step, ReSeFlow achieves up to a 48.5% error reduction on a painting task and a 21.9% reduction on a rotating triangle task compared to baseline methods that require 100 steps.
Robotic manipulation in unstructured environments requires generating robust, long-horizon, trajectory-level policies conditioned on perceptual observations, and it benefits from the data efficiency of SE(3)-equivariant diffusion models. However, these models suffer from high inference-time cost. Inspired by the inference efficiency of rectified flows, we introduce rectification to SE(3)-diffusion models and propose ReSeFlow, i.e., Rectifying SE(3)-Equivariant Policy Learning Flows, which provides fast, geodesic-consistent, low-computation policy generation. Crucially, both components employ SE(3)-equivariant networks to preserve rotational and translational symmetry, enabling robust generalization under rigid-body motions. Verified on simulated benchmarks, the proposed ReSeFlow with only one inference step achieves better performance, with lower geodesic distance, than the baseline methods: up to a 48.5% error reduction on the painting task and a 21.9% reduction on the rotating triangle task compared to the baseline's 100-step inference. The method takes advantage of both SE(3) equivariance and rectified flow, and moves generative policy learning models toward real-world application with both data and inference efficiency.
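The intuition behind single-step inference can be illustrated with a toy sketch. This is not the ReSeFlow implementation: it assumes a one-dimensional Euclidean state rather than SE(3), and the names `euler_integrate`, `straight_velocity`, and `curved_velocity` are hypothetical. It only shows that when the transport path is straight (constant velocity), one Euler step lands on the same endpoint as 100 steps, whereas a curved path needs many steps to get close.

```python
def euler_integrate(velocity, x0, n_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with forward Euler."""
    x = x0
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity(x, k * dt)
    return x

# Hypothetical 1-D toy values: a noise sample x0 and a target action x1.
x0, x1 = 0.0, 2.0

def straight_velocity(x, t):
    # Straight (rectified) path x_t = (1 - t) * x0 + t * x1: constant velocity.
    return x1 - x0

def curved_velocity(x, t):
    # Curved path x_t = x0 + (x1 - x0) * t**2: velocity changes over time.
    return 2.0 * t * (x1 - x0)

one_step_straight = euler_integrate(straight_velocity, x0, 1)
many_step_straight = euler_integrate(straight_velocity, x0, 100)
one_step_curved = euler_integrate(curved_velocity, x0, 1)
many_step_curved = euler_integrate(curved_velocity, x0, 100)

print("straight path:", one_step_straight, many_step_straight)
print("curved path:  ", one_step_curved, many_step_curved)
```

On the straight path, both step counts reach the target `x1`. On the curved path, the single step does not move at all because the velocity at `t = 0` is zero, while 100 steps come close to the target. On SE(3), the analogous straight paths would be geodesics, which is presumably what "geodesic-consistent" refers to in the abstract.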