CL, AI · Jan 17, 2022

An Empirical Study on the Overlapping Problem of Open-Domain Dialogue Datasets

arXiv:2201.06219v2585 citations
AI Analysis

This work highlights a data integrity issue for researchers using open-domain dialogue benchmarks, although its contribution is incremental: it focuses on dataset cleaning rather than novel methods.

The study identified overlapping dialogues in the DailyDialog and OpenSubtitles datasets, which can artificially inflate performance metrics, and addressed the problem by cleaning the datasets and establishing a data processing procedure.

Open-domain dialogue systems aim to converse with humans through text, and dialogue research has heavily relied on benchmark datasets. In this work, we observe the overlapping problem in DailyDialog and OpenSubtitles, two popular open-domain dialogue benchmark datasets. Our systematic analysis then shows that such overlapping can be exploited to obtain fake state-of-the-art performance. Finally, we address this issue by cleaning these datasets and setting up a proper data processing procedure for future research.
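To make the overlapping problem concrete, the following is a minimal sketch of one way to measure and remove train/test overlap. It is not the authors' procedure: matching whole dialogues exactly after lowercasing and whitespace normalization is an assumption made for illustration, and the names `normalize`, `overlap_rate`, and `remove_overlap` are hypothetical.

```python
import re
from typing import Iterable, List, Sequence, Tuple

# A dialogue is an ordered sequence of utterance strings.
Dialogue = Sequence[str]


def normalize(dialogue: Dialogue) -> Tuple[str, ...]:
    """Lowercase and collapse whitespace so trivial formatting
    differences do not hide a duplicate (an assumed matching rule)."""
    return tuple(re.sub(r"\s+", " ", u.strip().lower()) for u in dialogue)


def overlap_rate(train: Iterable[Dialogue], test: Iterable[Dialogue]) -> float:
    """Fraction of test dialogues that also appear in the training split."""
    seen = {normalize(d) for d in train}
    test_list = list(test)
    if not test_list:
        return 0.0
    hits = sum(normalize(d) in seen for d in test_list)
    return hits / len(test_list)


def remove_overlap(train: Iterable[Dialogue], test: Iterable[Dialogue]) -> List[Dialogue]:
    """Return the test split without dialogues found in train,
    also dropping repeated dialogues inside the test split itself."""
    seen = {normalize(d) for d in train}
    cleaned: List[Dialogue] = []
    for d in test:
        key = normalize(d)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(d)
    return cleaned


if __name__ == "__main__":
    train = [["Hi!", "Hello there."], ["How are you?", "Fine."]]
    test = [["hi!", "hello  there."], ["What's up?", "Not much."]]
    print(overlap_rate(train, test))
    print(remove_overlap(train, test))
```

Exact matching only catches verbatim duplicates. Near-duplicates or partially shared contexts would need a looser criterion, such as matching on individual context-response pairs, which this sketch does not attempt.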

Code Implementations: 1 repo