HiPolicy: Hierarchical Multi-Frequency Action Chunking for Policy Learning
HiPolicy addresses a fundamental challenge in robotic imitation learning and offers a method to improve both planning and control, though it appears incremental because it builds on existing action chunking approaches.
The paper tackles the trade-off between long-horizon planning and fine-grained control in robotic imitation learning by proposing HiPolicy, a hierarchical multi-frequency action chunking framework, which improves performance and efficiency in simulated and real-world tasks.
Robotic imitation learning faces a fundamental trade-off between modeling long-horizon dependencies and enabling fine-grained closed-loop control. Existing fixed-frequency action chunking approaches struggle to achieve both. Building on this insight, we propose HiPolicy, a hierarchical multi-frequency action chunking framework that jointly predicts action sequences at different frequencies to capture both coarse high-level plans and precise reactive motions. We extract and fuse hierarchical features from history observations aligned to each frequency for multi-frequency chunk generation, and introduce an entropy-guided execution mechanism that adaptively balances long-horizon planning with fine-grained control based on action uncertainty. Experiments on diverse simulated benchmarks and real-world manipulation tasks show that HiPolicy can be seamlessly integrated into existing 2D and 3D generative policies, delivering consistent improvements in performance while significantly enhancing execution efficiency.
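The abstract does not say how action uncertainty is measured or how it drives execution, so the following is only a minimal sketch of one way an entropy-guided switch between coarse and fine chunks could work. It assumes the policy can draw several sampled coarse chunks, fits a diagonal Gaussian to the samples at each timestep, and compares the mean entropy against a threshold. The names `step_entropy` and `choose_chunk`, the Gaussian estimator, and the threshold rule are all hypothetical illustrations, not the authors' method.

```python
import math
from typing import List

# One chunk: T timesteps, each an action vector of D floats (assumed layout).
Chunk = List[List[float]]


def step_entropy(actions: List[List[float]]) -> float:
    """Entropy of a diagonal Gaussian fitted to K sampled actions for one timestep.

    Assumption: uncertainty is approximated from the spread of sampled actions;
    the source does not specify the estimator.
    """
    k = len(actions)
    total = 0.0
    for d in range(len(actions[0])):
        values = [a[d] for a in actions]
        mean = sum(values) / k
        # Small floor keeps the logarithm finite when all samples agree.
        var = sum((v - mean) ** 2 for v in values) / k + 1e-8
        total += 0.5 * math.log(2.0 * math.pi * math.e * var)
    return total


def choose_chunk(coarse_samples: List[Chunk], threshold: float) -> str:
    """Return "coarse" when the sampled coarse chunks agree, otherwise "fine".

    Assumption: low entropy means the long-horizon plan can be trusted,
    while high entropy calls for high-frequency, closed-loop control.
    """
    horizon = len(coarse_samples[0])
    entropies = [
        step_entropy([sample[t] for sample in coarse_samples])
        for t in range(horizon)
    ]
    mean_entropy = sum(entropies) / horizon
    return "coarse" if mean_entropy < threshold else "fine"


# Two samples that agree exactly versus two that disagree strongly.
confident = [[[0.1, 0.2], [0.3, 0.4]], [[0.1, 0.2], [0.3, 0.4]]]
uncertain = [[[-1.0, -1.0], [-1.0, -1.0]], [[1.0, 1.0], [1.0, 1.0]]]
print(choose_chunk(confident, threshold=0.0))
print(choose_chunk(uncertain, threshold=0.0))
```

In this sketch the threshold is a free parameter that would need tuning per task. A real system might instead vary how many steps of a chunk are executed before replanning, rather than making a binary choice.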