RO CL HC LG MA · Aug 7, 2025

Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation

Albert Yu, Chengshu Li, Luca Macesanu, Arnav Balaji, Ruchira Ray, Raymond Mooney, Roberto Martín-Martín

arXiv:2508.05535v1 · 3 citations · h-index: 19

Originality: Incremental advance

AI Analysis

This work addresses the challenge of adapting robotic systems to diverse human partners in collaborative manipulation. The advance is incremental because it builds on existing mixed-initiative paradigms.

The paper tackles the problem of enabling effective human-robot collaboration in long-horizon tasks by developing MICoBot, a system that uses mixed-initiative dialog to coordinate task steps. In evaluations with 18 human participants over 27 hours, MICoBot significantly improved task success and user experience compared to baselines.

Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot's capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We apply a Mixed-Initiative dialog paradigm to Collaborative human-roBot teaming and propose MICoBot, a system that handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities (measured by a simulation-pretrained affordance model) and the human's estimated availability to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. Our extensive evaluations in simulation and in the real world (on a physical robot with 18 unique human participants over 27 hours) demonstrate the ability of our method to effectively collaborate with diverse human users, yielding significantly better task success and user experience than a pure LLM baseline and other agent allocation models. See additional videos and materials at https://robin-lab.cs.utexas.edu/MicoBot/.
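
The planner level described above (allocating each remaining step to the robot or the human) can be illustrated with a toy sketch. This is not the authors' implementation: the `Step` fields, the cost formulas, and the `failure_penalty` and `decline_penalty` parameters are all assumptions made up for illustration. It only shows the general idea of trading off the robot's estimated success probability against the human's effort and estimated availability.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    robot_success_prob: float  # hypothetical affordance-model output in [0, 1]
    human_effort: float        # hypothetical effort cost if the human does the step


def allocate(steps, human_help_prob, failure_penalty=10.0, decline_penalty=5.0):
    """Assign each step to the agent with the lower expected cost (toy model)."""
    plan = {}
    for step in steps:
        # Expected cost of the robot attempting the step and failing.
        robot_cost = (1.0 - step.robot_success_prob) * failure_penalty
        # Human effort plus the expected cost of the human declining the request.
        human_cost = step.human_effort + (1.0 - human_help_prob) * decline_penalty
        plan[step.name] = "robot" if robot_cost <= human_cost else "human"
    return plan


steps = [
    Step("pick up cup", robot_success_prob=0.9, human_effort=2.0),
    Step("open jar", robot_success_prob=0.2, human_effort=3.0),
]
plan_willing = allocate(steps, human_help_prob=0.8)
plan_unwilling = allocate(steps, human_help_prob=0.0, decline_penalty=6.0)
print(plan_willing)
print(plan_unwilling)
```

In this toy model, a step the robot is unlikely to complete is handed to the human only while the human appears willing to help. As the estimated availability drops, the same step shifts back to the robot.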
