RO AI, Mar 4

Cognition to Control - Multi-Agent Learning for Human-Humanoid Collaborative Transport

arXiv:2603.03768v1, 4.01 citations

Originality: Highly original

AI Analysis

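This work addresses physical human-humanoid collaboration, such as jointly transporting an object, where high-level intent must be turned into contact-stable whole-body motion while the robot adapts to its human partner. It offers an incremental solution to the open problem of combining long-horizon deliberation with low-latency control.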

The authors treat human-robot collaboration as a multi-agent learning problem and report higher success rates and greater robustness than single-agent and end-to-end baselines. In collaborative manipulation tasks, their approach produced stable coordination and emergent leader-follower behaviors.

Effective human-robot collaboration (HRC) requires translating high-level intent into contact-stable whole-body motion while continuously adapting to a human partner. Many vision-language-action (VLA) systems learn end-to-end mappings from observations and instructions to actions, but they often emphasize reactive (System 1-like) behavior and leave underspecified how sustained System 2-style deliberation can be integrated with reliable, low-latency continuous control. This gap is acute in multi-agent HRC, where long-horizon coordination decisions and physical execution must co-evolve under contact, feasibility, and safety constraints. We address this limitation with cognition-to-control (C2C), a three-layer hierarchy that makes the deliberation-to-control pathway explicit: (i) a VLM-based grounding layer that maintains persistent scene referents and infers embodiment-aware affordances/constraints; (ii) a deliberative skill/coordination layer (the System 2 core) that optimizes long-horizon skill choices and sequences under human-robot coupling via decentralized MARL cast as a Markov potential game with a shared potential encoding task progress; and (iii) a whole-body control layer that executes the selected skills at high frequency while enforcing kinematic/dynamic feasibility and contact stability. The deliberative layer is realized as a residual policy relative to a nominal controller, internalizing partner dynamics without explicit role assignment. Experiments on collaborative manipulation tasks show higher success and robustness than single-agent and end-to-end baselines, with stable coordination and emergent leader-follower behaviors.
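
Two ideas in the abstract can be illustrated with a small sketch: a residual policy that adds a bounded learned correction to a nominal controller's command, and the defining property of a potential game, namely that any single agent's change of action alters its own utility by exactly the change in a shared potential. The code below is not the paper's implementation. It uses a toy single-state game rather than a full Markov potential game, and every name and number in it (`residual_action`, `is_exact_potential`, the `progress` and `effort` values, the `scale` and `limit` parameters) is a hypothetical choice made for illustration.

```python
import itertools
import math


def residual_action(nominal, residual, scale=0.2, limit=1.0):
    """Nominal command plus a bounded learned correction, clipped to +/- limit."""
    correction = scale * math.tanh(residual)  # correction stays within +/- scale
    return max(-limit, min(limit, nominal + correction))


def is_exact_potential(utilities, potential, n_actions, tol=1e-9):
    """Return True if every unilateral deviation changes the deviating agent's
    utility by the same amount as it changes the potential.

    utilities: one dict per agent, mapping joint-action tuples to floats
    potential: dict mapping joint-action tuples to floats
    n_actions: number of actions available to each agent
    """
    for i, u in enumerate(utilities):
        for joint in itertools.product(*(range(n) for n in n_actions)):
            for alt in range(n_actions[i]):
                deviated = joint[:i] + (alt,) + joint[i + 1:]
                du = u[deviated] - u[joint]
                dp = potential[deviated] - potential[joint]
                if abs(du - dp) > tol:
                    return False
    return True


# Toy game (assumed values): agent 0 = human, agent 1 = humanoid.
# Action 0 = "wait", action 1 = "lift". Progress is high only if both lift.
progress = {(0, 0): 0.0, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 1.0}
effort = [[0.0, 0.1], [0.0, 0.3]]  # effort[i][a] = cost to agent i of its action a

# Each agent receives the shared task progress minus its own effort cost.
utilities = [
    {a: progress[a] - effort[i][a[i]] for a in progress} for i in range(2)
]

# Candidate potential: task progress minus the sum of all agents' effort costs.
shared_potential = {
    a: progress[a] - effort[0][a[0]] - effort[1][a[1]] for a in progress
}

print(is_exact_potential(utilities, shared_potential, (2, 2)))
print(is_exact_potential(utilities, progress, (2, 2)))
print(residual_action(0.95, 100.0))
```

In this toy setup, task progress alone is not an exact potential once agents pay private effort costs, whereas progress minus total effort is. How the paper's shared potential is actually constructed is not stated in the abstract.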
