WarmPrior: Straightening Flow-Matching Policies with Temporal Priors
This work identifies the source distribution as an underexplored design axis in generative policies for robotic control, and shows that changing it offers a simple improvement.
Replacing the standard Gaussian source distribution with WarmPrior, a temporally grounded prior built from recent action history, consistently improves success rates on robotic manipulation tasks by producing straighter probability paths. In reinforcement learning, it also improves sample efficiency and final performance.
Generative policies based on diffusion and flow matching have become a dominant paradigm for visuomotor robotic control. We show that replacing the standard Gaussian source distribution with WarmPrior, a simple temporally grounded prior constructed from readily available recent action history, consistently improves success rates on robotic manipulation tasks. We trace this gain to markedly straighter probability paths, echoing the effect of optimal-transport couplings in Rectified Flow. Beyond standard behavior cloning, WarmPrior also reshapes the exploration distribution in prior-space reinforcement learning, improving both sample efficiency and final performance. Collectively, these results identify the source distribution as an important and underexplored design axis in generative robot control.
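To make the idea of swapping the source distribution concrete, the following is a minimal sketch and not the paper's construction. The abstract does not specify how WarmPrior is built, so `warm_source` assumes one plausible form: repeat the most recent executed action across the action chunk and add Gaussian noise of scale `sigma`. All function names, the toy action values, and `sigma = 0.1` are hypothetical. The sketch builds linear flow-matching training pairs from either source and compares the mean squared displacement between source sample and expert chunk. That quantity only shows that a history-based source starts closer to the target; it does not measure the straightness of a learned flow.

```python
import random


def gaussian_source(horizon, action_dim, rng):
    """Standard source: i.i.d. N(0, 1) noise for every step of the action chunk."""
    return [[rng.gauss(0.0, 1.0) for _ in range(action_dim)] for _ in range(horizon)]


def warm_source(recent_actions, horizon, sigma, rng):
    """Assumed temporally grounded source (hypothetical form): the last executed
    action repeated over the chunk, perturbed by Gaussian noise of scale sigma."""
    last = recent_actions[-1]
    return [[a + sigma * rng.gauss(0.0, 1.0) for a in last] for _ in range(horizon)]


def interpolate(x0, x1, t):
    """Point on the linear flow-matching path: x_t = (1 - t) * x0 + t * x1."""
    return [[(1 - t) * a + t * b for a, b in zip(r0, r1)] for r0, r1 in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity regression target along the linear path: x1 - x0."""
    return [[b - a for a, b in zip(r0, r1)] for r0, r1 in zip(x0, x1)]


def mean_sq(chunk):
    """Mean squared entry of a (horizon x action_dim) chunk."""
    return sum(v * v for row in chunk for v in row) / (len(chunk) * len(chunk[0]))


# Toy data (made up for illustration): a smooth 2-D action trajectory.
recent_actions = [[0.50, -0.20], [0.52, -0.18]]
expert_chunk = [[0.54, -0.16], [0.56, -0.14], [0.58, -0.12], [0.60, -0.10]]

rng = random.Random(0)
horizon, action_dim, sigma, n_samples = 4, 2, 0.1, 200

gauss_cost = 0.0
warm_cost = 0.0
for _ in range(n_samples):
    x0_gauss = gaussian_source(horizon, action_dim, rng)
    x0_warm = warm_source(recent_actions, horizon, sigma, rng)
    # Average squared length of the velocity target for each choice of source.
    gauss_cost += mean_sq(target_velocity(x0_gauss, expert_chunk)) / n_samples
    warm_cost += mean_sq(target_velocity(x0_warm, expert_chunk)) / n_samples

print(f"mean squared displacement, Gaussian source: {gauss_cost:.4f}")
print(f"mean squared displacement, warm source:     {warm_cost:.4f}")
```

In a training loop, the pair `(interpolate(x0, x1, t), target_velocity(x0, x1))` with `t` drawn uniformly from [0, 1] would serve as the input and regression target for the velocity network; only the line that samples `x0` changes between the two variants.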