TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size
This work provides a scalable approach to cooperative human-object interaction in multi-agent systems, a step toward more realistic and adaptable humanoid control.
This paper addresses the challenge of cooperative human-object interaction (HOI) by developing TeamHOI, a framework that allows a single decentralized policy to manage cooperative HOIs with any number of agents. It achieves high success rates and coherent cooperation across diverse configurations, including 2 to 8 humanoid agents carrying objects of varied geometries.
Physics-based humanoid control has achieved remarkable progress in enabling realistic and high-performing single-agent behaviors, yet extending these capabilities to cooperative human-object interaction (HOI) remains challenging. We present TeamHOI, a framework that enables a single decentralized policy to handle cooperative HOIs across any number of cooperating agents. Each agent operates using local observations while attending to other teammates through a Transformer-based policy network with teammate tokens, allowing scalable coordination across variable team sizes. To enforce motion realism while addressing the scarcity of cooperative HOI data, we further introduce a masked Adversarial Motion Prior (AMP) strategy that uses single-human reference motions while masking object-interacting body parts during training. The masked regions are then guided through task rewards to produce diverse and physically plausible cooperative behaviors. Finally, to promote stable carrying, we design a team-size- and shape-agnostic formation reward. We evaluate TeamHOI on a challenging cooperative carrying task involving two to eight humanoid agents and varied object geometries. TeamHOI achieves high success rates and demonstrates coherent cooperation across diverse configurations with a single policy.
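The abstract does not give the network details, so the following is only a minimal sketch of the general idea behind teammate tokens: one agent's own token attends over a variable-length set of teammate tokens, with a mask hiding unused slots. The function name `attend_to_teammates`, the single attention head, and the weight matrices `w_q`, `w_k`, `w_v` are illustrative assumptions, not the TeamHOI architecture.

```python
import numpy as np

def attend_to_teammates(self_token, teammate_tokens, valid_mask, w_q, w_k, w_v):
    """Single-head attention from one agent's token over its teammate tokens.

    Hypothetical sketch, not the paper's network.
    self_token:      (d,)   embedding of the agent's local observation
    teammate_tokens: (n, d) one embedding per teammate slot (padding allowed)
    valid_mask:      (n,)   True for real teammates, False for padded slots
    """
    # The agent always attends to itself, so its own token is never masked.
    tokens = np.vstack([self_token, teammate_tokens])      # (n + 1, d)
    mask = np.concatenate([[True], valid_mask])            # (n + 1,)

    q = self_token @ w_q                                   # (d,)
    k = tokens @ w_k                                       # (n + 1, d)
    v = tokens @ w_v                                       # (n + 1, d)

    # Scaled dot-product scores; padded slots get -inf so their weight is 0.
    scores = k @ q / np.sqrt(q.shape[0])
    scores = np.where(mask, scores, -np.inf)

    # Numerically stable softmax over the valid tokens only.
    weights = np.exp(scores - scores[mask].max())
    weights /= weights.sum()
    return weights @ v                                     # (d,)

# Usage: the same weights serve a team of any size.
rng = np.random.default_rng(0)
d = 8
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
me = rng.normal(size=d)
mates = rng.normal(size=(3, d))                            # three teammates
out = attend_to_teammates(me, mates, np.ones(3, dtype=bool), w_q, w_k, w_v)
```

Because the softmax runs over a set of tokens instead of a fixed-width input vector, the parameter shapes do not depend on team size, and the result does not depend on the order in which teammates are listed. Under these assumptions, that is what lets one set of weights serve two agents or eight.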