CL, Dec 7, 2020

The Lab vs The Crowd: An Investigation into Data Quality for Neural Dialogue Models

arXiv:2012.03855v1
AI Analysis

This study addresses the problem of data quality for researchers developing data-driven dialogue models, showing the trade-offs between lab and crowd-sourced data collection.

This paper investigates how the data collection setting affects neural dialogue model performance. It finds that models trained on lab-collected data needed less than half as much data to reach similar accuracy as models trained on crowd-sourced data for the same interaction task.

Challenges around collecting and processing quality data have hampered progress in data-driven dialogue models. Previous approaches are moving away from costly, resource-intensive lab settings, where collection is slow but where the data is deemed of high quality. The advent of crowd-sourcing platforms, such as Amazon Mechanical Turk, has provided researchers with an alternative cost-effective and rapid way to collect data. However, the collection of fluid, natural spoken or textual interaction can be challenging, particularly between two crowd-sourced workers. In this study, we compare the performance of dialogue models for the same interaction task but collected in two different settings: in the lab vs. crowd-sourced. We find that fewer lab dialogues are needed to reach similar accuracy: less than half as much lab data as crowd-sourced data. We discuss the advantages and disadvantages of each data collection method.

Code Implementations: 1 repo
