RO · LG · Dec 22, 2025

Translating Flow to Policy via Hindsight Online Imitation

arXiv:2512.19269v1 · 1 citation · h-index: 12
Originality: Incremental advance
AI Analysis

This work addresses the scarcity of high-quality robot data, a bottleneck for scalable and transferable robot learning, and makes an incremental advance in acquiring policies from planners trained on cross-embodiment video data.

The paper tackles the challenge of grounding high-level task plans into executable robot actions by improving the low-level policy through online interactions. It reports more than a 2x performance improvement over the base policy across diverse manipulation tasks in simulation and the physical world.

Recent advances in hierarchical robot systems leverage a high-level planner to propose task plans and a low-level policy to generate robot actions. This design allows training the planner on action-free or even non-robot data sources (e.g., videos), providing transferable high-level guidance. Nevertheless, grounding these high-level plans into executable actions remains challenging, especially given the limited availability of high-quality robot data. To this end, we propose to improve the low-level policy through online interactions. Specifically, our approach collects online rollouts, retrospectively annotates the corresponding high-level goals from the achieved outcomes, and aggregates these hindsight-relabeled experiences to update a goal-conditioned imitation policy. Our method, Hindsight Flow-conditioned Online Imitation (HinFlow), instantiates this idea with 2D point flows as the high-level plan representation. Across diverse manipulation tasks in both simulation and the physical world, our method achieves more than a $2\times$ performance improvement over the base policy, significantly outperforming existing methods. Moreover, our framework enables policy acquisition from planners trained on cross-embodiment video data, demonstrating its potential for scalable and transferable robot learning.
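
The loop described in the abstract (collect online rollouts, relabel each step with the goal that was actually achieved, aggregate, update a goal-conditioned imitation policy) can be illustrated with a minimal toy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the "robot" is a single point on a 2D grid, a "flow" is the tuple of future displacements of that point, and `TablePolicy`, `collect_rollout`, `hindsight_relabel`, and `train` are hypothetical names. A real system would use a learned network and tracked image points instead of a lookup table.

```python
import random
from collections import Counter, defaultdict

# Toy 2D world (assumption): the "robot" is one point that moves one cell per step.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def step(pos, action):
    """Toy dynamics: add the action to the point position."""
    return (pos[0] + action[0], pos[1] + action[1])


class TablePolicy:
    """Hypothetical goal-conditioned policy: maps a flow to its most frequent action."""

    def __init__(self, rng):
        self.rng = rng
        self.counts = defaultdict(Counter)

    def act(self, flow):
        seen = self.counts.get(flow)
        if seen:
            return seen.most_common(1)[0][0]
        return self.rng.choice(ACTIONS)  # explore when the flow has never been seen

    def update(self, dataset):
        """Imitation update: refit the table on all aggregated (flow, action) pairs."""
        self.counts = defaultdict(Counter)
        for flow, action in dataset:
            self.counts[flow][action] += 1


def collect_rollout(policy, planned_flow, steps):
    """Run the current policy online, conditioned on the planner's proposed flow."""
    states, actions = [(0, 0)], []
    for _ in range(steps):
        action = policy.act(planned_flow)
        actions.append(action)
        states.append(step(states[-1], action))
    return states, actions


def hindsight_relabel(states, actions, horizon):
    """Label each action with the flow that was actually achieved after it."""
    samples = []
    for t in range(len(actions) - horizon + 1):
        x0, y0 = states[t]
        flow = tuple((states[t + k][0] - x0, states[t + k][1] - y0)
                     for k in range(1, horizon + 1))
        samples.append((flow, actions[t]))
    return samples


def train(planned_flow, iterations=3, rollouts=20, steps=10, seed=0):
    """Alternate between online collection, hindsight relabeling, and policy updates."""
    policy = TablePolicy(random.Random(seed))
    dataset = []  # aggregated across all iterations
    horizon = len(planned_flow)
    for _ in range(iterations):
        for _ in range(rollouts):
            states, actions = collect_rollout(policy, planned_flow, steps)
            dataset.extend(hindsight_relabel(states, actions, horizon))
        policy.update(dataset)
    return policy, dataset
```

Calling `train(((1, 0), (2, 0)))` asks the toy policy to realize a "move right twice" flow. The point of the sketch is that relabeled samples are correct by construction: each action is paired with the outcome it really produced, so even rollouts that miss the planned flow yield valid supervision for some other goal.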
