CV · AI · LG · Apr 19, 2023

Multipar-T: Multiparty-Transformer for Capturing Contingent Behaviors in Group Conversations

MIT
arXiv:2304.12204v1 · 5 citations · h-index: 26
Originality: Incremental advance
AI Analysis

This paper addresses the problem of enabling AI agents to handle complex group interactions in social AI applications. The contribution is incremental: it builds on existing transformer models and adds a novel attention mechanism.

The paper tackles the challenge of recognizing and interpreting multiparty behaviors in group conversations by proposing the Multiparty-Transformer (Multipar-T). On a video-based group engagement detection benchmark, it outperforms state-of-the-art approaches by 5.2% in average F-1 score and by up to 10.0% in individual class F-1 scores.

As we move closer to real-world AI systems, AI agents must be able to deal with multiparty (group) conversations. Recognizing and interpreting multiparty behaviors is challenging, as the system must recognize individual behavioral cues, deal with the complexity of multiple streams of data from multiple people, and recognize the subtle contingent social exchanges that take place amongst group members. To tackle this challenge, we propose the Multiparty-Transformer (Multipar-T), a transformer model for multiparty behavior modeling. The core component of our proposed approach is the Crossperson Attention, which is specifically designed to detect contingent behavior between pairs of people. We verify the effectiveness of Multipar-T on a publicly available video-based group engagement detection benchmark, where it outperforms state-of-the-art approaches in average F-1 scores by 5.2% and individual class F-1 scores by up to 10.0%. Through qualitative analysis, we show that our Crossperson Attention module is able to discover contingent behavior.
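The abstract does not spell out how Crossperson Attention is computed, so the following is only a minimal sketch of generic scaled dot-product cross-attention between two people's feature sequences. It is not the authors' implementation. The function names (`cross_person_attention`, `attend_to_group`), the tensor shapes, the random projection matrices, and the choice to average over the other group members are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_person_attention(target, other, w_q, w_k, w_v):
    """Generic cross-attention (hypothetical helper, not from the paper).

    Queries come from the target person's sequence; keys and values come
    from another person's sequence. Both inputs have shape (T, d).
    Returns the attended output (T, d_v) and the attention weights (T, T).
    """
    q = target @ w_q
    k = other @ w_k
    v = other @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def attend_to_group(features, person, w_q, w_k, w_v):
    """Average the pairwise outputs over every other person (an assumed
    aggregation choice). `features` has shape (P, T, d)."""
    outputs = [
        cross_person_attention(features[person], features[j], w_q, w_k, w_v)[0]
        for j in range(features.shape[0])
        if j != person
    ]
    return np.mean(outputs, axis=0)


# Toy data: 3 people, 4 time steps, 8-dimensional behavioral features.
rng = np.random.default_rng(0)
P, T, d = 3, 4, 8
features = rng.normal(size=(P, T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

out, weights = cross_person_attention(features[0], features[1], w_q, w_k, w_v)
group = attend_to_group(features, 0, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one time step of the target person attends to each time step of the other person. Under these assumptions, that is the kind of pairwise signal one could inspect when looking for contingent behavior.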

