CL, AI, LG · Feb 11, 2022

Dual Task Framework for Improving Persona-grounded Dialogue Dataset

arXiv:2202.05435v2 · 8 citations
AI Analysis

This work addresses data quality issues for persona-conditioned dialogue agents, offering an orthogonal improvement applicable to any model, though it is incremental as it builds on existing datasets and tasks.

The paper tackles annotation artifacts in persona-grounded dialogue datasets by introducing a data-centric approach that augments relevant personas using a dual-task framework, resulting in an 11.7 point accuracy gain over pre-trained language models on Persona-Chat.

This paper introduces a simple yet effective data-centric approach for improving persona-conditioned dialogue agents. Prior model-centric approaches depend unquestioningly on raw crowdsourced benchmark datasets such as Persona-Chat. In contrast, we aim to fix annotation artifacts in the benchmark itself, an improvement that is orthogonally applicable to any dialogue model. Specifically, we augment relevant personas to improve both the dialogue dataset and the agent by leveraging the primal-dual structure of two tasks: predicting dialogue responses and predicting personas, each based on the other. Experiments on Persona-Chat show that our approach outperforms pre-trained LMs by 11.7 points in accuracy.
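The primal-dual idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: a simple token-overlap score stands in for the learned models, and every name here (`overlap`, `predict_persona`, `predict_response`, `augment_personas`) is hypothetical. It only shows the shape of the loop: the dual task links a response to relevant persona sentences, and the augmented persona set then feeds the primal response-selection task.

```python
def tokens(text):
    """Lowercase, strip basic punctuation, and split into a token set."""
    return set(text.lower().replace(".", "").replace(",", "").split())


def overlap(a, b):
    """Jaccard similarity between token sets (assumed stand-in for a learned scorer)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def predict_persona(response, persona_pool, top_k=1):
    """Dual task (toy version): rank candidate persona sentences given a response."""
    ranked = sorted(persona_pool, key=lambda p: overlap(response, p), reverse=True)
    return ranked[:top_k]


def predict_response(personas, candidates):
    """Primal task (toy version): pick the candidate best supported by the personas."""
    return max(
        candidates,
        key=lambda c: max((overlap(c, p) for p in personas), default=0.0),
    )


def augment_personas(personas, response, persona_pool, top_k=1):
    """Add persona sentences that the dual task links to the observed response."""
    linked = predict_persona(response, persona_pool, top_k)
    return personas + [p for p in linked if p not in personas]


# Hypothetical example data, not taken from Persona-Chat.
personas = ["i like to ski"]
response = "i have two dogs at home"
pool = ["i have two dogs", "i work as a teacher"]

augmented = augment_personas(personas, response, pool)
chosen = predict_response(augmented, [response, "the weather is nice today"])
```

In this toy run, the dual step attaches the dog-related persona sentence to the dialogue, so the primal step has persona evidence that actually supports the gold response. A real system would replace `overlap` with trained response and persona predictors.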

