SeFA-Policy: Fast and Accurate Visuomotor Policy Learning with Selective Flow Alignment
This work addresses a key limitation in robotic imitation learning for real-time manipulation tasks, offering a scalable solution, though it appears incremental as it builds on existing rectified flow approaches.
The paper tackles accumulated error and instability in visuomotor policy learning by introducing Selective Flow Alignment (SeFA), which uses expert demonstrations to selectively correct generated actions so that they stay consistent with observations. The reported result is superior accuracy and robustness, with inference latency reduced by over 98%.
Developing efficient and accurate visuomotor policies poses a central challenge in robotic imitation learning. While recent rectified flow approaches have advanced visuomotor policy learning, they suffer from a key limitation: after iterative distillation, generated actions may deviate from the ground-truth actions corresponding to the current visual observation, leading to accumulated error as the reflow process repeats and to unstable task execution. We present Selective Flow Alignment (SeFA), an efficient and accurate visuomotor policy learning framework. SeFA resolves this challenge through a selective flow alignment strategy, which leverages expert demonstrations to selectively correct generated actions and restore consistency with observations, while preserving multimodality. This design introduces a consistency correction mechanism that ensures generated actions remain observation-aligned without sacrificing the efficiency of one-step flow inference. Extensive experiments across both simulated and real-world manipulation tasks show that SeFA-Policy surpasses state-of-the-art diffusion-based and flow-based policies, achieving superior accuracy and robustness while reducing inference latency by over 98%. By unifying rectified flow efficiency with observation-consistent action generation, SeFA provides a scalable and dependable solution for real-time visuomotor policy learning. Code is available at https://github.com/RongXueZoe/SeFA.
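
To make the two ideas in the abstract concrete, the following is a minimal sketch of one-step flow inference followed by a selective correction step when building reflow training pairs. It is not the authors' implementation: the abstract does not state the selection criterion, so the distance threshold used here is an assumption, and the names `one_step_action`, `selective_align`, `build_reflow_pairs`, and `toy_velocity` are hypothetical.

```python
import math
import random


def one_step_action(noise, obs, velocity_fn):
    """One-step flow inference: a single Euler step from t=0 to t=1."""
    v = velocity_fn(noise, 0.0, obs)
    return [z + dv for z, dv in zip(noise, v)]


def selective_align(generated, expert, threshold):
    """ASSUMED selection rule (not from the paper): keep the generated
    action when it lies within `threshold` (L2 distance) of the expert
    action for the same observation; otherwise fall back to the expert
    action. Returns (target_action, was_corrected)."""
    dist = math.sqrt(sum((g - e) ** 2 for g, e in zip(generated, expert)))
    if dist <= threshold:
        return list(generated), False
    return list(expert), True


def build_reflow_pairs(dataset, velocity_fn, threshold, seed=0):
    """Build (obs, noise, target, corrected) tuples for one reflow round.

    `dataset` is a list of (observation, expert_action) pairs. Targets
    that drifted away from the demonstration are replaced, so the drift
    is not carried into the next round of distillation."""
    rng = random.Random(seed)
    pairs = []
    for obs, expert in dataset:
        noise = [rng.gauss(0.0, 1.0) for _ in expert]
        generated = one_step_action(noise, obs, velocity_fn)
        target, corrected = selective_align(generated, expert, threshold)
        pairs.append((obs, noise, target, corrected))
    return pairs


def toy_velocity(z, t, obs):
    """Hypothetical stand-in for a learned velocity network. It transports
    the noise to the observation plus a constant 0.5 bias per dimension,
    which mimics a policy whose outputs have drifted."""
    return [o - zi + 0.5 for zi, o in zip(z, obs)]


if __name__ == "__main__":
    # Toy data where the expert action equals the observation vector.
    data = [([1.0, 2.0], [1.0, 2.0]), ([0.0, -1.0], [0.0, -1.0])]
    for obs, noise, target, corrected in build_reflow_pairs(data, toy_velocity, 0.1):
        print(obs, target, "corrected" if corrected else "kept")
```

In this sketch, a tight threshold replaces most drifted targets with expert actions, while a loose one keeps more of the policy's own samples. How SeFA actually balances correction against preserving multimodality is described in the paper and the linked repository, not here.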