CL, LG, ME, ML | Jul 26, 2024

Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation

arXiv:2407.18698v2 | 35 citations | h-index: 8
AI Analysis

This paper addresses the problem of improving text generation quality for users of language models, though it appears incremental because it extends existing contrastive search methods.

The paper tackles the challenge of decoding from large language models to produce high-quality text. It introduces adaptive contrastive search, a decoding strategy that incorporates an adaptive degeneration penalty guided by model uncertainty. The authors report enhanced creativity, diversity, coherence, and quality across various models and datasets.

Decoding from the output distributions of large language models to produce high-quality text is a complex challenge in language modeling. Various approaches, such as beam search, sampling with temperature, $k$-sampling, nucleus $p$-sampling, typical decoding, contrastive decoding, and contrastive search, have been proposed to address this problem, aiming to improve coherence, diversity, and resemblance to human-generated text. In this study, we introduce adaptive contrastive search, a novel decoding strategy extending contrastive search by incorporating an adaptive degeneration penalty, guided by the estimated uncertainty of the model at each generation step. This strategy is designed to enhance both the creativity and diversity of the language modeling process while at the same time producing coherent and high-quality generated text output. Our findings indicate performance enhancement in both aspects, across different model architectures and datasets, underscoring the effectiveness of our method in text generation tasks. Our code base, datasets, and models are publicly available.
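To make the idea in the abstract concrete, the following is a minimal sketch of a single decoding step. It is not the authors' implementation. The scoring rule follows the standard contrastive search objective: model confidence minus a degeneration penalty given by the maximum cosine similarity between a candidate's hidden state and the hidden states of the preceding tokens. How the penalty weight `alpha` is derived here, as the normalized Shannon entropy of the next-token distribution, is an assumption made for illustration; the paper defines its own uncertainty-based mapping, which the abstract does not spell out. All function and parameter names are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of `probs`, scaled to [0, 1] by log(len(probs))."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def adaptive_contrastive_step(probs, cand_hidden, ctx_hidden, k=3):
    """Choose the next token id for one generation step.

    probs:       next-token distribution, indexed by token id
    cand_hidden: hidden vector each token would produce, indexed by token id
    ctx_hidden:  hidden vectors of the tokens generated so far
    k:           number of top-probability candidates to re-rank
    """
    # ASSUMPTION: the penalty weight is the normalized entropy of the
    # distribution (uncertain model -> stronger degeneration penalty).
    # The paper uses its own uncertainty-to-penalty mapping.
    alpha = normalized_entropy(probs)

    top_k = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    best_token, best_score = None, -math.inf
    for tok in top_k:
        # Degeneration penalty: closest match to any earlier hidden state.
        penalty = max((cosine(cand_hidden[tok], h) for h in ctx_hidden),
                      default=0.0)
        score = (1.0 - alpha) * probs[tok] - alpha * penalty
        if score > best_score:
            best_token, best_score = tok, score
    return best_token, alpha


# Toy example: token 0 is the most likely but repeats the context exactly.
cand = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ctx = [[1.0, 0.0]]
flat_choice, flat_alpha = adaptive_contrastive_step([0.7, 0.2, 0.1], cand, ctx)
peak_choice, peak_alpha = adaptive_contrastive_step([0.98, 0.01, 0.01], cand, ctx)
```

In this toy setup the flatter distribution yields a larger `alpha`, so the repetitive top candidate is penalized and a different token wins. The sharply peaked distribution yields a small `alpha`, and the step behaves almost like greedy decoding. A fixed `alpha` would recover plain contrastive search.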

Code Implementations: 1 repo