CL AI HC LG Mar 17, 2024

Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback

MIT
arXiv:2403.11330v21 citations h-index: 88
Originality: Incremental advance
AI Analysis

This work addresses the challenge of improving dialogue agents for conversational AI applications, representing an incremental advancement in reward modeling techniques.

The paper tackles the problem of aligning LLM-based dialogue agents by decomposing global explicit session-level rewards using local implicit multimodal feedback, resulting in consistent improvements across various conversational metrics compared to baseline methods.

We describe an approach for aligning an LLM-based dialogue agent based on global (i.e., dialogue-level) rewards, while also taking into account naturally occurring multimodal signals. At a high level, our approach (dubbed GELI) learns a local, turn-level reward model by decomposing the human-provided Global Explicit (GE) session-level reward, using Local Implicit (LI) multimodal reward signals to crossmodally shape the reward decomposition step. This decomposed reward model is then used as part of the standard RLHF pipeline to improve an LLM-based dialogue agent. We run quantitative and qualitative human studies to evaluate the performance of our GELI approach, and find that it shows consistent improvements across various conversational metrics compared to baseline methods.
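
The decomposition idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a linear turn-level reward, assumes that turn rewards should sum to the session-level score, and assumes each turn carries a scalar implicit signal (for instance, a score derived from a listener's facial affect). The names `turn_reward`, `combined_loss`, `fit`, and the weight `lam` are all hypothetical, as are the squared-error losses.

```python
from typing import List, Sequence

Turn = Sequence[float]  # hypothetical per-turn feature vector


def turn_reward(w: Sequence[float], x: Turn) -> float:
    """Hypothetical linear turn-level reward: r(x) = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def combined_loss(w: Sequence[float], turns: Sequence[Turn],
                  global_reward: float, implicit: Sequence[float],
                  lam: float) -> float:
    """Global-explicit term plus lam times the local-implicit term."""
    preds = [turn_reward(w, x) for x in turns]
    # GE term (assumed form): turn rewards should sum to the session score.
    ge = (sum(preds) - global_reward) ** 2
    # LI term (assumed form): each turn reward should track its implicit signal.
    li = sum((p - s) ** 2 for p, s in zip(preds, implicit)) / len(turns)
    return ge + lam * li


def fit(turns: Sequence[Turn], global_reward: float,
        implicit: Sequence[float], lam: float = 0.5,
        lr: float = 0.05, steps: int = 2000) -> List[float]:
    """Plain gradient descent on combined_loss, starting from zero weights."""
    w = [0.0] * len(turns[0])
    n = len(turns)
    for _ in range(steps):
        preds = [turn_reward(w, x) for x in turns]
        residual = sum(preds) - global_reward
        grad = [0.0] * len(w)
        for x, p, s in zip(turns, preds, implicit):
            for j, xj in enumerate(x):
                # Gradient of the GE term plus the lam-weighted LI term.
                grad[j] += 2.0 * residual * xj + lam * 2.0 * (p - s) * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy session: three turns, one session-level score, one implicit score per turn.
turns = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
implicit = [0.2, 0.8, 1.0]
w = fit(turns, global_reward=2.0, implicit=implicit)
print([round(turn_reward(w, x), 3) for x in turns])
```

In this sketch the global term alone leaves the per-turn split underdetermined, since many assignments sum to the same session score. The implicit term is what selects among them, which is the role the abstract assigns to the LI signals. A turn-level reward of this kind could then stand in for the reward model in an RLHF loop.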

