CL AI LG · Dec 6, 2024

DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling

Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao, Yongbin Li

arXiv:2412.04905v46.18 citations · h-index: 19 · Has Code · ACL

Originality: Incremental advance

AI Analysis

This work addresses the problem of precise modeling and assessment for LLM-based dialogue systems, though it appears incremental as it builds on existing dialogue studies.

The authors address the lack of a systematic benchmark for modeling dialogue elements across dialogue stages by introducing the DEMO benchmark and the DEMO agent. Their experiments indicate that current LLMs have room for improvement, and that the agent performs well in both dialogue element modeling and out-of-domain tasks.

Dialogue systems built on large language models (LLMs) have become one of the central modes of human-machine interaction, producing vast amounts of conversation logs and an increasing demand for dialogue generation. A dialogue's life-cycle spans from $\textit{Prelude}$ through $\textit{Interlocution}$ to $\textit{Epilogue}$, encompassing rich dialogue elements. Despite the large volume of dialogue-related studies, there has been no systematic investigation of the dialogue stages that could frame the construction of a benchmark covering comprehensive dialogue elements. This hinders the precise modeling, generation, and assessment of LLM-based dialogue systems. To bridge this gap, in this paper, we introduce a new research task, $\textbf{D}$ialogue $\textbf{E}$lement $\textbf{MO}$deling, which includes $\textit{Element Awareness}$ and $\textit{Dialogue Agent Interaction}$, and propose a novel benchmark, $\textbf{DEMO}$, designed for comprehensive dialogue modeling and assessment. On this basis, we further build the DEMO agent, which learns to model dialogue elements via imitation learning. Extensive experiments on DEMO indicate that current representative LLMs still have considerable room for improvement, and that our DEMO agent performs well in both dialogue element modeling and out-of-domain tasks.
