CV · Dec 8, 2025

InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs

arXiv:2512.07410v1 · 3 citations · h-index: 10
Originality: Highly original
AI Analysis

The paper addresses multi-agent coordination in humanoid control, with applications in robotics and simulation, and extends text-driven physics-based control beyond single-agent methods.

The paper tackles the problem of generating physically plausible multi-agent humanoid behaviors from text prompts. Its result is InterAgent, a framework that outperforms the compared baselines and reports state-of-the-art performance in producing coherent and semantically faithful interactions.

Humanoid agents are expected to emulate the complex coordination inherent in human social behaviors. However, existing methods are largely confined to single-agent scenarios, overlooking the physically plausible interplay essential for multi-agent interactions. To bridge this gap, we propose InterAgent, the first end-to-end framework for text-driven physics-based multi-agent humanoid control. At its core, we introduce an autoregressive diffusion transformer equipped with multi-stream blocks, which decouples proprioception, exteroception, and action to mitigate cross-modal interference while enabling synergistic coordination. We further propose a novel interaction graph exteroception representation that explicitly captures fine-grained joint-to-joint spatial dependencies to facilitate network learning. Within this representation, we additionally devise a sparse edge-based attention mechanism that dynamically prunes redundant connections and emphasizes critical inter-agent spatial relations, thereby enhancing the robustness of interaction modeling. Extensive experiments demonstrate that InterAgent consistently outperforms multiple strong baselines, achieving state-of-the-art performance. It produces coherent, physically plausible, and semantically faithful multi-agent behaviors from text prompts alone. Our code and data will be released to facilitate future research.
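To make the interaction-graph idea concrete, here is a minimal sketch of joint-to-joint edges between two agents, followed by a simple sparsification step. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the edge features or how pruning is done, so the offset-plus-distance features, the distance-based top-k rule (standing in for the learned, dynamic pruning the abstract describes), and the names `interaction_edges` and `sparse_edge_weights` are all hypothetical.

```python
import math

def interaction_edges(joints_a, joints_b):
    """Build every joint-to-joint edge between two agents.

    Each edge is (i, j, offset, distance), where i indexes a joint of
    agent A and j a joint of agent B. The feature choice is an assumption.
    """
    edges = []
    for i, pa in enumerate(joints_a):
        for j, pb in enumerate(joints_b):
            offset = tuple(b - a for a, b in zip(pa, pb))
            dist = math.sqrt(sum(c * c for c in offset))
            edges.append((i, j, offset, dist))
    return edges

def sparse_edge_weights(edges, k, temperature=1.0):
    """Keep the k shortest edges and softmax-normalize their weights.

    This hand-written top-k heuristic is a stand-in for a learned sparse
    attention mechanism; closer joint pairs receive larger weights.
    """
    kept = sorted(edges, key=lambda e: e[3])[:k]
    if not kept:
        return []
    logits = [-e[3] / temperature for e in kept]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [(i, j, w / total) for (i, j, _, _), w in zip(kept, exps)]

# Toy poses with three joints per agent (made-up coordinates).
agent_a = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 1.0, 0.0)]
agent_b = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.6, 1.0, 0.0)]

edges = interaction_edges(agent_a, agent_b)
weights = sparse_edge_weights(edges, k=3)
for i, j, w in weights:
    print(f"A joint {i} -> B joint {j}: weight {w:.3f}")
```

In this toy case the dense graph has nine edges, and the pruning step keeps only the three closest joint pairs, which is the intuition behind emphasizing critical inter-agent spatial relations while dropping redundant connections.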

