RO · AI · Jun 7

HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning

Zechu Li, Yufeng Jin, Xiaoyang Liu, Puze Liu, Vignesh Prasad, Carlo D'Eramo, Georgia Chalvatzaki

arXiv:2606.08610v1 · 10.2

Predicted impact: top 23% in RO · last 90 days · Originality: Incremental advance

AI Analysis

For robotics researchers and practitioners, HARBOR reduces the expert effort required for RL workflows, but the approach is incremental as it automates existing pipeline stages rather than introducing new RL algorithms.

HARBOR automates the robot RL pipeline from environment setup to policy training in simulation, reducing engineering effort while matching or improving over default configurations across 16 tasks in manipulation, locomotion, and bimanual dexterous control, with policies transferable to real robots.

Reinforcement learning (RL) has become a powerful paradigm for robot learning, particularly in sim-to-real settings, but its broader adoption remains limited by the engineering pipeline surrounding the algorithms. Building tasks, shaping rewards, and tuning hyperparameters require substantial expert effort, making RL workflows costly and difficult to scale. We introduce HARBOR, an agentic framework that frames robot RL automation as a harness-engineering problem: given a simulator codebase and a task specification, it automates the workflow from environment setup to policy training in simulation. HARBOR decomposes such high-level objectives into bounded stages executed by specialized agents through standardized commands, persistent artifacts, executable gates, and reusable knowledge, and scales iteration via decentralized parallel trials and experience learning across runs. We evaluate HARBOR across 6 benchmarks and 16 tasks in total, spanning manipulation, locomotion, and bimanual dexterous control. We demonstrate that HARBOR automates the simulation RL workflow end-to-end, designs rewards, tunes algorithms to match or improve over default configurations, and reduces engineering effort at practical token and wall-clock cost; the resulting policies can also be transferred to real robots.
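The abstract's idea of bounded stages with persistent artifacts and executable gates can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not HARBOR's actual interface: the names `Stage`, `run_pipeline`, and `demo`, the JSON artifact files, the retry limit, and the placeholder stage bodies are all hypothetical. Parallel trials and experience learning across runs are not modeled here.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List

Artifacts = Dict[str, dict]


@dataclass
class Stage:
    """One bounded pipeline stage (hypothetical structure, not HARBOR's API)."""
    name: str
    run: Callable[[Artifacts, int], dict]  # (earlier artifacts, attempt number) -> artifact
    gate: Callable[[dict], bool]           # executable check the artifact must pass


def run_pipeline(stages: List[Stage], workdir: Path, max_attempts: int = 3) -> Artifacts:
    """Run stages in order; a stage only advances once its gate passes."""
    artifacts: Artifacts = {}
    for stage in stages:
        for attempt in range(1, max_attempts + 1):
            artifact = stage.run(artifacts, attempt)
            if stage.gate(artifact):
                # Persist the artifact so later stages (or later runs) can reuse it.
                (workdir / f"{stage.name}.json").write_text(json.dumps(artifact))
                artifacts[stage.name] = artifact
                break
        else:
            # No attempt passed the gate: stop instead of passing bad output downstream.
            raise RuntimeError(f"stage {stage.name!r} failed its gate after {max_attempts} attempts")
    return artifacts


def demo() -> Artifacts:
    # Placeholder stages standing in for environment setup, reward design, and training.
    stages = [
        Stage("env_setup",
              lambda prev, n: {"obs_dim": 4, "act_dim": 2},
              lambda a: a["obs_dim"] > 0),
        Stage("reward_design",
              lambda prev, n: {"weights": [1.0, 0.1]},
              lambda a: len(a["weights"]) > 0),
        # The toy "training" result improves with each attempt, so the gate
        # rejects early attempts and accepts a later one.
        Stage("train",
              lambda prev, n: {"attempt": n, "mean_return": 0.4 * n},
              lambda a: a["mean_return"] >= 1.0),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        return run_pipeline(stages, Path(tmp))


if __name__ == "__main__":
    print(demo())
```

In this sketch the gate is an ordinary function that returns a boolean, so a failed check triggers a retry of the same stage rather than letting a broken environment or reward reach training.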
