CommCP: Efficient Multi-Agent Coordination via LLM-Based Communication with Conformal Prediction

arXiv:2602.06038v1 | 2 citations | h-index: 3
Originality: Incremental advance
AI Analysis

This paper addresses efficient information gathering and coordination among heterogeneous robots in household scenarios, an incremental advance in multi-agent embodied AI.

The paper formalizes multi-agent coordination for embodied question answering as a multi-agent multi-task EQA (MM-EQA) problem and proposes CommCP, an LLM-based communication framework that uses conformal prediction to calibrate messages. The authors report that it significantly improves task success rate and exploration efficiency over baselines.

To complete assignments provided by humans in natural language, robots must interpret commands, generate and answer relevant questions for scene understanding, and manipulate target objects. Real-world deployments often require multiple heterogeneous robots with different manipulation capabilities to handle different assignments cooperatively. Beyond the need for specialized manipulation skills, effective information gathering is important in completing these assignments. To address this component of the problem, we formalize the information-gathering process in a fully cooperative setting as an underexplored multi-agent multi-task Embodied Question Answering (MM-EQA) problem, which is a novel extension of canonical Embodied Question Answering (EQA), where effective communication is crucial for coordinating efforts without redundancy. To address this problem, we propose CommCP, a novel LLM-based decentralized communication framework designed for MM-EQA. Our framework employs conformal prediction to calibrate the generated messages, thereby minimizing receiver distractions and enhancing communication reliability. To evaluate our framework, we introduce an MM-EQA benchmark featuring diverse, photo-realistic household scenarios with embodied questions. Experimental results demonstrate that CommCP significantly enhances the task success rate and exploration efficiency over baselines. The experiment videos, code, and dataset are available on our project website: https://comm-cp.github.io.
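The abstract does not spell out how conformal prediction calibrates messages, so the following is only a minimal sketch of generic split conformal prediction, not the authors' implementation. It assumes each candidate message content (here, a room label) comes with a model confidence, uses `1 - confidence` as the nonconformity score, and applies a hypothetical gating rule that sends a message only when the prediction set contains exactly one option. All function names, labels, and numbers are illustrative assumptions.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of calibration nonconformity scores.

    cal_scores: scores 1 - p(true option) on held-out calibration examples.
    alpha: target miscoverage rate (e.g. 0.25 for roughly 75% coverage).
    """
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep every option.
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, qhat):
    """Keep every option whose nonconformity score 1 - p is within qhat."""
    return [label for label, p in probs.items() if 1.0 - p <= qhat]


def should_send(probs, qhat):
    """Hypothetical gating rule (an assumption, not taken from the paper):
    send a message only when the prediction set is a single option."""
    options = prediction_set(probs, qhat)
    return len(options) == 1, options


# Illustrative calibration scores (made up for this sketch).
cal_scores = [0.05, 0.1, 0.2, 0.3, 0.35, 0.45, 0.5, 0.6, 0.7]
qhat = conformal_threshold(cal_scores, alpha=0.25)

confident = {"kitchen": 0.85, "bedroom": 0.10, "bathroom": 0.05}
ambiguous = {"kitchen": 0.45, "bedroom": 0.42, "bathroom": 0.13}

print(should_send(confident, qhat))
print(should_send(ambiguous, qhat))
```

Under these assumptions, a confident estimate yields a single-option set and is sent, while an ambiguous estimate yields a multi-option set and is withheld, which is one way a sender could avoid distracting the receiver with unreliable messages.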

