CL AI HC

Jan 10, 2024

An Analysis of User Behaviors for Objectively Evaluating Spoken Dialogue Systems

Koji Inoue, Divesh Lala, Keiko Ochi, Tatsuya Kawahara, Gabriel Skantze

arXiv:2401.04867v2

1.01 citations

h-index: 40

Originality: Incremental advance

AI Analysis

This work addresses the need for reproducible evaluation in dialogue systems research, though it is incremental as it builds on existing methods by analyzing behavioral correlations.

The paper tackles the challenge of objectively evaluating spoken dialogue systems by proposing a framework based on user behaviors. It finds that specific behaviors, such as utterance counts and turn-taking metrics, correlate with subjective scores, and that the informative behaviors differ across social dialogue tasks.

Establishing evaluation schemes for spoken dialogue systems is important but challenging. While subjective evaluations are commonly used in user experiments, objective evaluations are necessary for comparing research results and for reproducibility. To address this issue, we propose a framework for indirectly but objectively evaluating systems based on users' behaviors. To this end, we investigate the relationship between user behaviors and subjective evaluation scores in three social dialogue tasks: attentive listening, job interview, and first-meeting conversation. The results reveal that in dialogue tasks where user utterances are primary, such as attentive listening and job interview, indicators like the number of utterances and words play a significant role in evaluation. Observing disfluency can also indicate the effectiveness of formal tasks, such as job interview. On the other hand, in dialogue tasks with high interactivity, such as first-meeting conversation, behaviors related to turn-taking, like average switch pause length, become more important. These findings suggest that selecting appropriate user behaviors can provide valuable insights for objective evaluation in each social dialogue task.
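As a rough illustration of the kind of analysis the abstract describes, the sketch below ranks user behaviors by how strongly they correlate with a subjective score. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say which statistic was used, so Pearson correlation is assumed here, and the function names, field names (`num_utterances`, `avg_switch_pause_s`, `subjective_score`), and toy values are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_behaviors(sessions, score_key="subjective_score"):
    """Return (behavior, r) pairs sorted by |r|, strongest first.

    Each session is a dict holding behavior measurements plus one
    subjective score; every key other than score_key is treated as a behavior.
    """
    behaviors = [key for key in sessions[0] if key != score_key]
    scores = [session[score_key] for session in sessions]
    results = [
        (name, pearson([session[name] for session in sessions], scores))
        for name in behaviors
    ]
    return sorted(results, key=lambda item: abs(item[1]), reverse=True)


# Made-up toy sessions for illustration only; these are NOT data from the paper.
toy_sessions = [
    {"num_utterances": 10, "avg_switch_pause_s": 1.5, "subjective_score": 2.0},
    {"num_utterances": 20, "avg_switch_pause_s": 0.5, "subjective_score": 3.0},
    {"num_utterances": 30, "avg_switch_pause_s": 1.0, "subjective_score": 4.0},
    {"num_utterances": 40, "avg_switch_pause_s": 1.0, "subjective_score": 5.0},
]

if __name__ == "__main__":
    for name, r in rank_behaviors(toy_sessions):
        print(f"{name}: r = {r:+.3f}")
```

In practice such a ranking would be computed separately for each dialogue task, since the abstract reports that the informative behaviors differ between tasks; a rank-based statistic or a significance test may be more appropriate for small user studies.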
