LG · CL · ML · Jul 3, 2020

On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation

arXiv:2007.01488v2 · 7 citations
AI Analysis

This work addresses a theoretical gap in the evaluation of text generation models, but it is incremental in that it builds on existing quality-diversity evaluation frameworks.

The paper investigates how quality-diversity metrics relate to the distribution-fitting goal of text generation models. It proves that, under certain conditions, a linear combination of quality and diversity can serve as a divergence metric between the generated and real distributions, and it proposes CR/NRR as a substitute for BLEU/Self-BLEU.

The goal of text generation models is to fit the underlying real probability distribution of text. For performance evaluation, quality and diversity metrics are usually applied. However, it is still not clear to what extent the quality-diversity evaluation can reflect the distribution-fitting goal. In this paper, we try to reveal this relation through a theoretical approach. We prove that under certain conditions, a linear combination of quality and diversity constitutes a divergence metric between the generated distribution and the real distribution. We also show that the commonly used BLEU/Self-BLEU metric pair fails to match any divergence metric, and thus propose CR/NRR as a substitute quality/diversity metric pair.
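
To make the central claim concrete, here is a minimal sketch of one familiar case in which quality plus diversity equals a (negated) divergence. It assumes, purely for illustration, that quality is the expected log-likelihood of generated samples under the real distribution and that diversity is the Shannon entropy of the generated distribution; these definitions, the function names, and the toy distributions are assumptions and are not taken from the paper, which may use different metrics and conditions. Under these assumptions the sum is exactly the negative KL divergence from the generated to the real distribution.

```python
import math

def quality(p_gen, p_real):
    # Assumed quality metric: expected log-likelihood of generated
    # samples under the real distribution, E_{x~P_g}[log P_r(x)].
    return sum(g * math.log(r) for g, r in zip(p_gen, p_real) if g > 0)

def diversity(p_gen):
    # Assumed diversity metric: Shannon entropy of the generated
    # distribution, -E_{x~P_g}[log P_g(x)].
    return -sum(g * math.log(g) for g in p_gen if g > 0)

def kl_divergence(p_gen, p_real):
    # KL(P_g || P_r) over a shared finite support.
    return sum(g * math.log(g / r) for g, r in zip(p_gen, p_real) if g > 0)

# Toy distributions over three hypothetical sentences.
p_real = [0.5, 0.3, 0.2]
p_gen = [0.7, 0.2, 0.1]

# Linear combination with unit weights: quality + diversity.
combined = quality(p_gen, p_real) + diversity(p_gen)
print(combined, -kl_divergence(p_gen, p_real))
```

In this toy setting the combined score is at most zero and reaches zero only when the generated distribution matches the real one, which is the sense in which a quality-diversity combination can track the distribution-fitting goal.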

