DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation
DiffSampling addresses low diversity and accuracy in neural text generation, a concern for users of language models, but the contribution is incremental because it builds upon existing decoding strategies.
The paper tackles the problem of language models generating repetitive and unoriginal text by proposing DiffSampling, a decoding method that truncates the token probability distribution using the differences between consecutive, sorted probabilities. Across four text-generation tasks, it performs at least on par with the existing methods it builds upon in quality, while potentially improving output diversity.
Despite their growing capabilities, language models still frequently reproduce content from their training data, generate repetitive text, and favor common grammatical patterns and vocabulary. A possible cause is the decoding strategy: the most common strategies either consider only the most probable tokens, which reduces output diversity, or increase the likelihood of unlikely tokens, compromising output accuracy and correctness. In this paper, we propose DiffSampling, a new decoding method that leverages a mathematical analysis of the token probability distribution to ensure the generation of contextually appropriate text. In particular, the difference between consecutive, sorted probabilities can be used to truncate incorrect tokens. We also propose two variations of the method that aim to correct the subtle inconsistencies of common sampling strategies. Experiments involving four different text-generation tasks demonstrate that our approach consistently performs at least on par with the existing methods it builds upon in terms of quality, while potentially improving output diversity.
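
The truncation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact rule, so this assumes one plausible reading: sort the next-token probabilities in descending order, find the largest drop between consecutive values, keep only the tokens before that drop, renormalize, and sample. The names `diff_truncate` and `sample_next_token` are hypothetical, and the two variations mentioned above are not covered.

```python
import random


def diff_truncate(probs):
    """Keep the tokens that precede the largest drop in sorted probability.

    Assumption: the cut point is the largest difference between consecutive,
    descending-sorted probabilities. The abstract does not specify this.
    Returns a dict mapping kept token index -> renormalized probability.
    """
    if not probs:
        return {}
    # Token indices ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    sorted_p = [probs[i] for i in order]
    if len(sorted_p) == 1:
        return {order[0]: 1.0}
    # drops[k] is the gap between the k-th and (k+1)-th sorted probabilities.
    drops = [sorted_p[k] - sorted_p[k + 1] for k in range(len(sorted_p) - 1)]
    # Position of the largest gap; ties resolve to the earliest position.
    cut = max(range(len(drops)), key=lambda k: drops[k])
    kept = order[: cut + 1]
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_next_token(probs, rng=random):
    """Sample one token index from the truncated, renormalized distribution."""
    kept = diff_truncate(probs)
    tokens = list(kept)
    weights = [kept[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Illustrative distribution over five token indices (made-up numbers).
example_probs = [0.05, 0.40, 0.35, 0.15, 0.05]
example_kept = diff_truncate(example_probs)
example_token = sample_next_token(example_probs)
```

In this example the largest gap falls between 0.35 and 0.15, so only token indices 1 and 2 survive truncation. Resolving ties toward the earliest gap is a choice made for this sketch, not something stated in the source.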