CL · Sep 12, 2022

Open-Domain Dialog Evaluation using Follow-Ups Likelihood

arXiv:2209.05185v1 · 580 citations · h-index: 64
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reliable automated dialog evaluation for researchers and developers. It appears incremental, since it builds on existing language model techniques.

The paper tackles automatic evaluation of open-domain dialogs by measuring how likely a language model is to continue a conversation with a fixed set of follow-ups. Compared against twelve existing methods, it reports the highest correlation with human evaluations.

Automatic evaluation of open-domain dialogs remains an unsolved problem. Moreover, existing methods do not correlate strongly with human annotations. This paper presents a new automated evaluation method using follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., "not really relevant here", "what are you trying to say"). When compared against twelve existing methods, our new evaluation achieves the highest correlation with human evaluations.
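
The idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `follow_up_score` and `toy_log_prob` are hypothetical, the averaging and sign convention are assumptions, and `toy_log_prob` is a crude word-overlap stand-in for a real language model's log-likelihood. Only the two follow-up strings come from the abstract.

```python
import math
import re
from typing import Callable, Sequence

# The two example follow-ups quoted in the abstract. Both signal a confused
# listener, so a high likelihood suggests a poor preceding response.
FOLLOW_UPS = [
    "not really relevant here",
    "what are you trying to say",
]

# Assumed interface: log P(follow_up | context) under some language model.
LogProbFn = Callable[[str, str], float]


def follow_up_score(
    context: str,
    log_prob: LogProbFn,
    follow_ups: Sequence[str] = FOLLOW_UPS,
) -> float:
    """Score a dialog; higher means the negative follow-ups are less likely.

    Assumption: the score is the negated mean log-likelihood of the
    follow-ups. The aggregation used in the paper may differ.
    """
    total = sum(log_prob(context, f) for f in follow_ups)
    return -total / len(follow_ups)


def _words(turn: str) -> set:
    # Drop a leading speaker label such as "A:" and lowercase the rest.
    text = turn.split(":", 1)[-1].lower()
    return set(re.findall(r"[a-z']+", text))


def toy_log_prob(context: str, follow_up: str) -> float:
    """Hypothetical stand-in for a real language model.

    It ignores the follow-up text and pretends a confused follow-up is
    likely (p = 0.4) when the last two turns share no words, and unlikely
    (p = 0.05) otherwise.
    """
    turns = [t for t in context.strip().split("\n") if t.strip()]
    if len(turns) < 2:
        return math.log(0.05)
    shared = _words(turns[-2]) & _words(turns[-1])
    return math.log(0.05 if shared else 0.4)


coherent = "A: Do you like hiking?\nB: Yes, hiking in the mountains is my favorite."
off_topic = "A: Do you like hiking?\nB: The stock market closed early today."

coherent_score = follow_up_score(coherent, toy_log_prob)
off_topic_score = follow_up_score(off_topic, toy_log_prob)
```

With the toy scorer, the coherent exchange receives the higher score because the confused follow-ups are judged less likely after it. In practice `log_prob` would be replaced by the summed token log-probabilities from a pretrained dialog language model.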

Code Implementations: 1 repo
