CL · Aug 27, 2021

Lingxi: A Diversity-aware Chinese Modern Poetry Generation System

arXiv:2108.12108v1223 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating highly novel poetic text, a long-standing problem in natural language processing. The contribution is incremental, since it builds on existing sampling techniques.

The authors tackle the generation of novel and creative Chinese modern poetry by introducing Lingxi, a system whose diversity-aware sampling algorithm yields significantly higher novelty than traditional sampling methods while maintaining fluency.

Poetry generation has been a difficult task in natural language processing. Unlike plain neural text generation tasks, poetry has a high requirement for novelty: an easily understood sentence with too many high-frequency words might not be considered poetic, while an adequately ambiguous sentence with low-frequency words can be novel and creative. Inspired by this, we present Lingxi, a diversity-aware Chinese modern poetry generation system. We propose the nucleus sampling with randomized head (NS-RH) algorithm, which randomizes the high-frequency part (the "head") of the predicted distribution in order to emphasize the "comparatively low-frequency" words. The proposed algorithm can significantly increase the novelty of generated poetry compared with traditional sampling methods. The permutation of the distribution is controllable by tuning the filtering parameter that determines the "head" to permute, achieving diversity-aware sampling. We find that even when a large portion of the filtered vocabulary is randomized, the system can still generate fluent poetry with notably higher novelty. We also propose a semantic-similarity-based rejection sampling algorithm, which creates a longer and more informative context from the short input poetry title while maintaining high semantic similarity to the title, alleviating the off-topic issue.
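
The NS-RH idea described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out how the "head" is delimited, so here it is assumed to be the smallest set of top tokens whose cumulative probability reaches a threshold `head_p`, and "randomizing" is assumed to mean shuffling the probabilities among those head tokens. The function names `ns_rh_distribution` and `ns_rh_sample`, and the parameters `top_p` and `head_p`, are hypothetical.

```python
import numpy as np


def ns_rh_distribution(probs, top_p=0.9, head_p=0.5, rng=None):
    """Return (token_ids, sampling_probs) after nucleus filtering and head shuffling.

    Assumption: the head is the top-ranked prefix whose mass reaches head_p.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=float)

    order = np.argsort(-probs)   # token ids, most probable first
    sorted_p = probs[order]
    cum = np.cumsum(sorted_p)

    # Nucleus: smallest prefix whose cumulative mass reaches top_p.
    n_nucleus = min(int(np.searchsorted(cum, top_p)) + 1, len(probs))
    # Head: smallest prefix whose mass reaches head_p, never larger than the nucleus.
    n_head = min(int(np.searchsorted(cum, head_p)) + 1, n_nucleus)

    token_ids = order[:n_nucleus]
    p = sorted_p[:n_nucleus].copy()
    # Shuffle the probabilities among head tokens, so the most frequent
    # candidates lose their guaranteed advantage over each other.
    p[:n_head] = rng.permutation(p[:n_head])
    return token_ids, p / p.sum()  # renormalize over the nucleus


def ns_rh_sample(probs, top_p=0.9, head_p=0.5, rng=None):
    """Draw one token id from the shuffled-head nucleus distribution."""
    rng = np.random.default_rng() if rng is None else rng
    token_ids, p = ns_rh_distribution(probs, top_p, head_p, rng)
    return int(rng.choice(token_ids, p=p))


if __name__ == "__main__":
    # Toy next-token distribution over a five-token vocabulary.
    toy_probs = [0.1, 0.5, 0.04, 0.3, 0.06]
    print(ns_rh_sample(toy_probs, top_p=0.85, head_p=0.6))
```

In this sketch, a larger `head_p` shuffles more of the nucleus and so pushes sampling further toward comparatively rare tokens, while `head_p=0` leaves a one-token head and reduces to ordinary nucleus sampling.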

