CL · Aug 17, 2022

SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation

arXiv:2208.08094v5580 citations · h-index: 22 · Has Code
AI Analysis

This paper addresses the need for automated, fine-grained evaluation of dialogue systems. Its self-supervised approach is novel, and the improvement in evaluation accuracy is incremental.

The paper tackles the problem of fine-grained dialogue evaluation by proposing a self-supervised framework that models the correlation between turn and dialogue quality, achieving high consistency with human evaluations and outperforming state-of-the-art models on multiple benchmarks.

This paper introduces a novel Self-supervised Fine-grained Dialogue Evaluation framework (SelF-Eval). The core idea is to model the correlation between turn quality and the entire dialogue quality. We first propose a novel automatic data construction method that can automatically assign fine-grained scores to arbitrary dialogue data. Then we train SelF-Eval with a multi-level contrastive learning schema which helps to distinguish different score levels. Experimental results on multiple benchmarks show that SelF-Eval is highly consistent with human evaluations and better than the state-of-the-art models. We give a detailed analysis of the experiments in this paper. Our code is available on GitHub.
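
The two steps named in the abstract (automatic fine-grained scoring, then training to separate score levels) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a graded sample is built by replacing k turns of a dialogue with random utterances and that its score is the fraction of turns left intact. It also swaps in a simple pairwise margin ranking loss as a stand-in for the multi-level contrastive objective. The names `build_graded_samples`, `multi_level_ranking_loss`, and `margin` are hypothetical.

```python
import random


def build_graded_samples(dialogue, distractor_pool, rng):
    """Return (turns, score) pairs, one per corruption level.

    Assumption (not stated in the abstract): a level-k sample replaces k
    turns with random utterances, and its score is the fraction of the
    original turns that remain.
    """
    n = len(dialogue)
    samples = []
    for k in range(n + 1):
        replaced = set(rng.sample(range(n), k))
        turns = [
            rng.choice(distractor_pool) if i in replaced else turn
            for i, turn in enumerate(dialogue)
        ]
        samples.append((turns, 1.0 - k / n))
    return samples


def multi_level_ranking_loss(predicted, target, margin=0.1):
    """Average hinge loss over all ordered pairs whose target scores differ.

    A pair is penalised when the sample with the higher target score is not
    predicted at least `margin` above the sample with the lower one.
    """
    losses = []
    for i in range(len(target)):
        for j in range(len(target)):
            if target[i] > target[j]:
                losses.append(max(0.0, margin - (predicted[i] - predicted[j])))
    return sum(losses) / len(losses) if losses else 0.0


rng = random.Random(0)
dialogue = [
    "Hi, how are you?",
    "Good, thanks. You?",
    "Fine. Any plans today?",
    "Just reading a book.",
]
pool = ["The train leaves at noon.", "I prefer green tea.", "Turn left at the bank."]

samples = build_graded_samples(dialogue, pool, rng)
targets = [score for _, score in samples]
```

In this sketch a scorer that ranks the levels correctly, with gaps wider than the margin, incurs zero loss, while a scorer that outputs the same value for every sample is penalised by the margin on every pair.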

Code Implementations: 1 repo
