CL · Oct 11, 2022

Measuring and Improving Semantic Diversity of Dialogue Generation

Stanford

arXiv:2210.05725v224.3296 citations · h-index: 17 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses limited semantic diversity in open-domain dialogue systems. The advance is incremental because it builds on existing evaluation and training approaches.

The paper tackles the problem of evaluating and improving semantic diversity in dialogue generation. It introduces a new metric that aligns better with human judgments than lexical-level diversity metrics, and a training method that improves both diversity and coherency.

Response diversity has become an important criterion for evaluating the quality of open-domain dialogue generation models. However, current evaluation metrics for response diversity often fail to capture the semantic diversity of generated responses, as they mainly consider lexical aspects of the generated responses. In this paper, we introduce a new automatic evaluation metric to measure the semantic diversity of generated responses. Through human evaluation, we demonstrate that our proposed metric captures human judgments on response diversity better than existing lexical-level diversity metrics. Furthermore, motivated by analyzing an existing dialogue dataset, we propose a simple yet effective learning method that improves the semantic diversity of generated responses. Our learning method weights training samples based on the semantic distribution of the training set. We show that our learning method improves response diversity and coherency better than other baseline methods through automatic and human evaluation.
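
The abstract does not spell out how the metric or the sample weights are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each response has already been assigned to a semantic cluster by some external step (for example, clustering sentence embeddings), scores diversity as the entropy of the cluster distribution, and weights training samples by inverse cluster frequency. The function names `semantic_entropy` and `inverse_frequency_weights` and both formulas are illustrative assumptions.

```python
import math
from collections import Counter


def semantic_entropy(cluster_ids):
    """Entropy (in nats) of the empirical distribution over semantic clusters.

    Assumption: cluster_ids holds one hypothetical cluster label per
    generated response. Higher entropy means responses spread more evenly
    across semantic clusters.
    """
    if not cluster_ids:
        raise ValueError("cluster_ids must not be empty")
    total = len(cluster_ids)
    entropy = 0.0
    for count in Counter(cluster_ids).values():
        p = count / total
        entropy -= p * math.log(p)
    return entropy


def inverse_frequency_weights(train_cluster_ids):
    """Per-sample weights that up-weight rare semantic clusters.

    Assumption: inverse-frequency weighting stands in for "weights training
    samples based on the semantic distribution of the training set". The
    weights are scaled so that their mean over the training set is 1.
    """
    if not train_cluster_ids:
        raise ValueError("train_cluster_ids must not be empty")
    counts = Counter(train_cluster_ids)
    n = len(train_cluster_ids)
    num_seen = len(counts)
    return [n / (num_seen * counts[c]) for c in train_cluster_ids]
```

In a training loop, such weights would typically multiply the per-sample loss, so that responses from over-represented semantic clusters contribute less to the gradient than responses from rare ones.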
