IR CL LG | Feb 25, 2019

Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss

Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke

arXiv:1902.09191v1 | 98 citations

Originality: Incremental advance

AI Analysis

This paper addresses generic, uninteresting responses in dialogue systems, though it is an incremental improvement over existing methods.

The paper tackles the low-diversity problem in Seq2Seq dialogue response generation, where models often produce generic responses. It proposes a Frequency-Aware Cross-Entropy (FACE) loss function that substantially improves response diversity on benchmark datasets.

Sequence-to-Sequence (Seq2Seq) models have achieved encouraging performance on the dialogue response generation task. However, existing Seq2Seq-based response generation methods suffer from a low-diversity problem: they frequently generate generic responses, which make the conversation less interesting. In this paper, we address the low-diversity problem by investigating its connection with model over-confidence reflected in predicted distributions. Specifically, we first analyze the influence of the commonly used Cross-Entropy (CE) loss function, and find that the CE loss function prefers high-frequency tokens, which results in low-diversity responses. We then propose a Frequency-Aware Cross-Entropy (FACE) loss function that improves over the CE loss function by incorporating a weighting mechanism conditioned on token frequency. Extensive experiments on benchmark datasets show that the FACE loss function is able to substantially improve the diversity of existing state-of-the-art Seq2Seq response generation methods, in terms of both automatic and human evaluations.
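
The idea of weighting the cross-entropy loss by token frequency can be illustrated with a minimal sketch. The abstract does not give the exact FACE weighting formula, so the linear down-weighting below, the `floor` parameter, and the function names `frequency_weights` and `weighted_cross_entropy` are all assumptions made for illustration, not the authors' formulation.

```python
import math
from collections import Counter


def frequency_weights(corpus_tokens, vocab, floor=0.1):
    """Assumed scheme: weight falls linearly with relative token frequency.

    The most frequent token gets raw weight `floor`; rarer tokens get
    weights closer to 1. Weights are rescaled so that their mean is 1.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts[t] for t in vocab)
    rel = {t: counts[t] / total for t in vocab}
    max_rel = max(rel.values())
    raw = {t: 1.0 - (1.0 - floor) * rel[t] / max_rel for t in vocab}
    mean = sum(raw.values()) / len(raw)
    return {t: w / mean for t, w in raw.items()}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def weighted_cross_entropy(logits, target_index, vocab, weights):
    """Cross-entropy for one target token, scaled by that token's weight."""
    probs = softmax(logits)
    return -weights[vocab[target_index]] * math.log(probs[target_index])


# Toy vocabulary and corpus (hypothetical data, for illustration only).
vocab = ["i", "do", "not", "know", "pizza"]
corpus = ["i"] * 6 + ["do"] * 4 + ["not"] * 4 + ["know"] * 4 + ["pizza"]
weights = frequency_weights(corpus, vocab)

# With identical (uniform) predictions, an error on the rare token
# costs more than an error on the frequent one.
uniform_logits = [0.0] * len(vocab)
loss_frequent = weighted_cross_entropy(uniform_logits, 0, vocab, weights)
loss_rare = weighted_cross_entropy(uniform_logits, 4, vocab, weights)
```

Under this assumed scheme, a model trained with the weighted loss receives a weaker training signal from high-frequency targets such as "i" and a stronger one from rare targets such as "pizza", which is the general mechanism the abstract describes for counteracting the CE loss's preference for high-frequency tokens.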
