CL · LG · Mar 6, 2019

Negative Training for Neural Dialogue Response Generation

arXiv:1903.02134v5 · 1032 citations
Originality: Incremental advance
AI Analysis

This work matters to developers and users of open-domain dialogue systems because it reduces harmful or boring outputs, though it is an incremental advance that builds on existing training methods.

The paper targets undesirable generation behaviors in neural dialogue models, such as malicious or generic responses. It proposes a Negative Training framework that fine-tunes a trained model on its own samples that exhibit these behaviors. The authors report significantly lower hit rates for malicious responses and improved response diversity.

Although deep learning models have brought tremendous advancements to the field of open-domain dialogue response generation, recent research results have revealed that the trained models have undesirable generation behaviors, such as malicious responses and generic (boring) responses. In this work, we propose a framework named "Negative Training" to minimize such behaviors. Given a trained model, the framework will first find generated samples that exhibit the undesirable behavior, and then use them to feed negative training signals for fine-tuning the model. Our experiments show that negative training can significantly reduce the hit rate of malicious responses, or discourage frequent responses and improve response diversity.
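The two-step loop described in the abstract (find generated samples that show the undesirable behavior, then use them as negative training signals) can be illustrated with a toy sketch. This is not the authors' implementation: the "model" here is just a softmax over four fixed candidate responses, and the names `negative_training`, `is_undesirable`, `RESPONSES`, and `GENERIC`, the learning rate, and the blocklist-style detector are all assumptions made for illustration. The negative signal is modeled as gradient descent on the log-probability of each flagged sample.

```python
import math
import random


def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_training(logits, responses, is_undesirable, steps=200, lr=0.5, seed=0):
    """Toy loop: sample a response; if it is flagged, lower its log-probability."""
    rng = random.Random(seed)
    logits = list(logits)  # work on a copy so the caller's list is untouched
    for _ in range(steps):
        probs = softmax(logits)
        # Step 1: draw a sample from the current model.
        idx = rng.choices(range(len(responses)), weights=probs, k=1)[0]
        if not is_undesirable(responses[idx]):
            continue  # only flagged samples produce a training signal
        # Step 2: negative signal. For a softmax,
        # d log p(idx) / d logit_j = 1[j == idx] - p_j,
        # and we step against that gradient to reduce p(idx).
        for j in range(len(logits)):
            grad = (1.0 if j == idx else 0.0) - probs[j]
            logits[j] -= lr * grad
    return logits


# Hypothetical candidate responses; the first two stand in for "generic" replies.
RESPONSES = [
    "i don't know",
    "i'm not sure",
    "the museum opens at nine",
    "try the ramen place downtown",
]
GENERIC = {"i don't know", "i'm not sure"}
initial_logits = [2.0, 1.5, 0.0, 0.0]  # the toy model starts out favoring generic replies

before = softmax(initial_logits)
after = softmax(negative_training(initial_logits, RESPONSES, lambda r: r in GENERIC))
generic_mass_before = before[0] + before[1]
generic_mass_after = after[0] + after[1]
print(f"generic probability mass: {generic_mass_before:.3f} -> {generic_mass_after:.3f}")
```

In a real dialogue model the samples would be full generated sequences, the detector would be task-specific (for example, a check for malicious targets or a response-frequency threshold), and the update would apply to the network's parameters rather than to four logits.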

Code Implementations: 1 repo
