CL · Jul 25, 2022

DialCrowd 2.0: A Quality-Focused Dialog System Crowdsourcing Toolkit

arXiv:2207.12551v1586 citations · h-index: 63
Originality: Synthesis-oriented
AI Analysis

The paper addresses data quality issues faced by dialog system developers and researchers, though its contribution is incremental because it builds on existing crowdsourcing tools.

The paper tackles the problem of low-quality crowdsourced data for dialog systems by introducing DialCrowd 2.0, a toolkit that helps requesters present tasks more clearly and communicate with workers more effectively. The toolkit is directly applicable to current developer workflows.

Dialog system developers need high-quality data to train, fine-tune and assess their systems. They often use crowdsourcing for this since it provides large quantities of data from many workers. However, the data may not be of sufficiently good quality. This can be due to the way that the requester presents a task and how they interact with the workers. This paper introduces DialCrowd 2.0 to help requesters obtain higher quality data by, for example, presenting tasks more clearly and facilitating effective communication with workers. DialCrowd 2.0 guides developers in creating improved Human Intelligence Tasks (HITs) and is directly applicable to the workflows used currently by developers and researchers.
