FLASH: Efficient Visuomotor Policy via Sparse Sampling
For real-time robotic control, FLASH dramatically reduces inference latency while maintaining high success rates, addressing a critical bottleneck in deploying generative visuomotor policies.
FLASH Policy replaces discrete action-chunk generation with a continuous Legendre polynomial trajectory representation, enabling single-step flow matching inference that achieves state-of-the-art success rates (≥92%) and a per-episode inference time of 31.40 ms (up to 175× faster than diffusion policies).
Generative models such as diffusion and flow matching have become dominant paradigms for visuomotor policy learning, yet their reliance on iterative denoising incurs high inference latency that is incompatible with real-time robotic control. We present Fast Legendre-polynomial Action policy via Sparse History-anchored flow (FLASH Policy), which replaces discrete action-chunk generation with a continuous Legendre polynomial trajectory representation. Specifically, by fitting expert demonstrations under sparse temporal sampling, FLASH enables a single inference to cover a significantly extended action horizon. To further accelerate generation, FLASH initiates the flow matching process from the polynomial coefficients of the action history rather than from uninformative Gaussian noise, shortening the transport distance and enabling accurate single-step inference. Moreover, analytic polynomial differentiation directly provides desired-velocity feed-forward signals to the torque controller without numerical approximation. Extensive experiments on five simulated and two real-world manipulation tasks demonstrate that FLASH achieves state-of-the-art success rates ($\ge 92\%$ across all tasks), a per-episode inference time of $31.40\,\mathrm{ms}$ (up to $175\times$ faster than diffusion policies and $18\times$ faster than prior flow matching policies), up to $4\times$ faster training convergence than ACT, and a $5\times$ to $7\times$ reduction in controller tracking error compared to discrete-action baselines.
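To make the trajectory representation concrete, the following is a minimal sketch of fitting sparsely sampled joint positions with a Legendre series and reading off the velocity by analytic differentiation. It is not the authors' implementation: the function names, the time normalization to $[-1, 1]$, the least-squares fit, and the polynomial degree are all assumptions chosen for illustration, using only NumPy's `numpy.polynomial.legendre` module.

```python
import numpy as np
from numpy.polynomial import legendre as L


def to_unit_interval(tau, horizon):
    """Map physical time tau in [0, horizon] to t in [-1, 1] (assumed convention)."""
    return 2.0 * np.asarray(tau, dtype=float) / horizon - 1.0


def fit_trajectory(tau_sparse, q_sparse, horizon, degree):
    """Least-squares Legendre fit of sparse samples.

    q_sparse has shape (n_samples, n_joints); the result has shape
    (degree + 1, n_joints), one coefficient column per joint.
    """
    t = to_unit_interval(tau_sparse, horizon)
    return L.legfit(t, q_sparse, degree)


def position(coeffs, tau, horizon):
    """Evaluate the fitted trajectory at arbitrary times; returns (n_times, n_joints)."""
    return L.legval(to_unit_interval(tau, horizon), coeffs).T


def velocity(coeffs, tau, horizon):
    """Analytic time derivative of the fitted trajectory.

    legder differentiates with respect to t; the factor dt/dtau = 2 / horizon
    (passed as scl) converts the result to a derivative in physical time.
    """
    dcoeffs = L.legder(coeffs, m=1, scl=2.0 / horizon)
    return L.legval(to_unit_interval(tau, horizon), dcoeffs).T


# Hypothetical demonstration: two joints sampled at only six time points.
horizon = 2.0                                  # seconds covered by one inference (assumed)
tau_sparse = np.linspace(0.0, horizon, 6)
q_sparse = np.stack([tau_sparse**2, tau_sparse**3 - tau_sparse], axis=1)

coeffs = fit_trajectory(tau_sparse, q_sparse, horizon, degree=3)

# Dense queries at a controller rate need no further network inference.
tau_dense = np.linspace(0.0, horizon, 200)
q_des = position(coeffs, tau_dense, horizon)    # desired positions
dq_des = velocity(coeffs, tau_dense, horizon)   # feed-forward velocities
```

In this sketch a handful of coefficients per joint stands in for the whole action horizon, so the controller can query positions and velocities at any rate, and the velocity comes from differentiating the series rather than from finite differences between discrete actions.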