CL · Mar 12, 2025

TRACE: Real-Time Multimodal Common Ground Tracking in Situated Collaborative Dialogues

Hannah VanderHoeven, Brady Bhalla, Ibrahim Khebour, Austin Youngren, Videep Venkatesha, Mariah Bradford, Jack Fitzgerald, Carlos Mabrey, Jingxuan Tu, Yifan Zhu, Kenneth Lai, Changsoo Jung

arXiv:2503.09511v118.210 citations · h-index: 15 · NAACL

Originality: Incremental advance

AI Analysis

This work addresses the need for AI agents to mediate multiparty, multimodal collaborations, representing an incremental step forward in this domain.

The paper tackles real-time common ground tracking in situated collaborative dialogues. It presents TRACE, a system that uses multimodal inputs (speech, actions, gestures, and visual attention) to monitor the group's epistemic position and beliefs toward task-relevant propositions, with performance fast enough for live interaction.

We present TRACE, a novel system for live *common ground* tracking in situated collaborative tasks. With a focus on fast, real-time performance, TRACE tracks the speech, actions, gestures, and visual attention of participants, uses these multimodal inputs to determine the set of task-relevant propositions that have been raised as the dialogue progresses, and tracks the group's epistemic position and beliefs toward them as the task unfolds. Amid increased interest in AI systems that can mediate collaborations, TRACE represents an important step forward for agents that can engage with multiparty, multimodal discourse.
