A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation
This work addresses the need for diverse and meaningful text generation in NLP applications, offering an incremental improvement over existing decoding methods.
The paper tackles the problem of generating diverse outputs in conditional generation tasks by proposing Composition Sampling, which first samples a composition in the form of an entity chain and then uses beam search to generate text grounded in that chain. On summarization and question generation datasets, automatic and human evaluations indicate higher-quality outputs than previous stochastic decoding strategies.
We propose Composition Sampling, a simple but effective method for conditional generation that produces diverse outputs of higher quality than previous stochastic decoding strategies. It builds on recently proposed plan-based neural generation models (Narayan et al., 2021) that are trained to first create a composition of the output and then generate the output by conditioning on both the composition and the input. Our approach avoids text degeneration by first sampling a composition in the form of an entity chain and then using beam search to generate the best possible text grounded in this entity chain. Experiments on summarization (CNN/DailyMail and XSum) and question generation (SQuAD), using existing and newly proposed automatic metrics together with human-based evaluation, demonstrate that Composition Sampling is currently the best available decoding strategy for generating diverse meaningful outputs.
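The two-stage procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy chain distribution `PLAN_DIST`, the hand-written scorer `toy_next_logprobs`, and all function names are hypothetical stand-ins for a trained plan-based model, and nucleus (top-p) sampling is assumed as the sampling scheme, since the abstract only says that a composition is sampled.

```python
import math
import random

# Hypothetical toy distribution over entity chains (the "compositions").
# A real system would sample the chain token by token from a trained model.
PLAN_DIST = {
    ("Alice", "Paris"): 0.5,
    ("Bob", "Paris"): 0.3,
    ("Alice", "Bob"): 0.15,
    ("Bob",): 0.05,
}
VOCAB = ["the", "met", "in", "<eos>", "Alice", "Bob", "Paris"]


def sample_chain(plan_dist, rng, top_p=0.9):
    """Stage 1: nucleus-sample one entity chain (assumed sampling scheme)."""
    ranked = sorted(plan_dist.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for chain, p in ranked:
        kept.append((chain, p))
        mass += p
        if mass >= top_p:  # keep the smallest set covering top_p of the mass
            break
    chains, weights = zip(*kept)
    return rng.choices(chains, weights=weights, k=1)[0]


def toy_next_logprobs(chain, prefix):
    """Hypothetical decoder step: favors the next unmentioned chain entity."""
    remaining = [e for e in chain if e not in prefix]
    scores = {}
    for tok in VOCAB:
        if tok == "<eos>":
            scores[tok] = 3.0 if not remaining else -3.0
        elif remaining and tok == remaining[0]:
            scores[tok] = 2.0
        elif tok in prefix:
            scores[tok] = -2.0
        else:
            scores[tok] = 0.0
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return {tok: s - log_z for tok, s in scores.items()}


def beam_search(chain, beam_size=3, max_len=8):
    """Stage 2: deterministic beam search conditioned on the sampled chain."""
    beams, finished = [((), 0.0)], []
    for _ in range(max_len):
        candidates = [
            (prefix + (tok,), score + lp)
            for prefix, score in beams
            for tok, lp in toy_next_logprobs(chain, prefix).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            (finished if prefix[-1] == "<eos>" else beams).append((prefix, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses if needed
    best = max(finished, key=lambda c: c[1])[0]
    return tuple(tok for tok in best if tok != "<eos>")


def composition_sampling(plan_dist, n_samples=3, seed=0):
    """Diversity comes from stage 1 only; stage 2 adds no randomness."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        chain = sample_chain(plan_dist, rng)
        outputs.append((chain, beam_search(chain)))
    return outputs


for chain, text in composition_sampling(PLAN_DIST):
    print(chain, "->", " ".join(text))
```

In this sketch, randomness enters only through the choice of entity chain, while the text for a given chain is decoded deterministically. That mirrors the division of labor the abstract describes, although the toy scorer says nothing about how a trained model would actually behave.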