CL, Apr 22, 2020

Trading Off Diversity and Quality in Natural Language Generation

arXiv:2004.10450v1 (832 citations)
AI Analysis

This work addresses the lack of consensus on decoding algorithms for tasks like storytelling and dialogue, providing a framework for evaluation and a new algorithm, but it is incremental as it builds on existing methods like nucleus sampling.

The paper tackles the problem of balancing quality and diversity in open-ended language generation by framing decoding as a multi-objective optimization problem. It finds that all evaluated methods perform similarly when diversity is the priority, while nucleus sampling outperforms the others when quality matters more. The experiments also confirm the 'likelihood trap': high-likelihood sequences are often of surprisingly low quality.

For open-ended language generation tasks such as storytelling and dialogue, choosing the right decoding algorithm is critical to controlling the tradeoff between generation quality and diversity. However, there presently exists no consensus on which decoding procedure is best or even the criteria by which to compare them. We address these issues by casting decoding as a multi-objective optimization problem aiming to simultaneously maximize both response quality and diversity. Our framework enables us to perform the first large-scale evaluation of decoding methods along the entire quality-diversity spectrum. We find that when diversity is a priority, all methods perform similarly, but when quality is viewed as more important, the recently proposed nucleus sampling (Holtzman et al. 2019) outperforms all other evaluated decoding algorithms. Our experiments also confirm the existence of the 'likelihood trap', the counter-intuitive observation that high-likelihood sequences are often surprisingly low quality. We leverage our findings to create and evaluate an algorithm called selective sampling which tractably approximates globally-normalized temperature sampling.
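To make the decoding methods named in the abstract concrete, here is a minimal Python sketch of per-token temperature sampling and nucleus (top-p) filtering applied to a toy next-token distribution. It is illustrative and is not the paper's code: the function names, the toy probabilities, and the parameter values are all assumptions. It shows only the locally normalized, per-token variants and does not attempt selective sampling or globally-normalized temperature sampling.

```python
import random

def apply_temperature(probs, temperature):
    """Raise each probability to the power 1/temperature and renormalize.

    temperature < 1 sharpens the distribution (favoring likely tokens);
    temperature > 1 flattens it (favoring diversity).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

def nucleus_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose total mass
    reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered

def sample_token(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical next-token distribution over a 4-token vocabulary.
toy_probs = [0.5, 0.3, 0.15, 0.05]

sharpened = apply_temperature(toy_probs, temperature=0.5)
nucleus = nucleus_filter(toy_probs, top_p=0.75)
token = sample_token(nucleus, random.Random(0))
```

In this sketch both `temperature` and `top_p` act as a single knob on the quality-diversity tradeoff: lowering either concentrates probability on the most likely tokens, while raising either spreads it out.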

