CL · Mar 12

BLooP: Zero-Shot Abstractive Summarization using Large Language Models with Bigram Lookahead Promotion

arXiv:2603.11415v18.1 · h-index: 38 · Has Code

Predicted impact: top 9% in CL · last 90 days

Originality: Incremental advance

AI Analysis

The paper addresses faithfulness in zero-shot abstractive summarization with LLMs. The advance is incremental because it builds on existing decoding methods.

In zero-shot abstractive summarization, large language models often miss key details and include extraneous information. The paper proposes BLooP, a training-free decoding intervention that encourages models to generate tokens forming bigrams from the source document. It reports improvements in ROUGE and BARTScore across multiple models and datasets, and human evaluation shows significant gains in faithfulness without reducing readability.

Abstractive summarization requires models to generate summaries that convey information in the source document. While large language models can generate summaries without fine-tuning, they often miss key details and include extraneous information. We propose BLooP (Bigram Lookahead Promotion), a simple training-free decoding intervention that encourages large language models (LLMs) to generate tokens that form bigrams from the source document. BLooP operates through a hash table lookup at each decoding step, requiring no training, fine-tuning, or model modification. We demonstrate improvements in ROUGE and BARTScore for Llama-3.1-8B-Instruct, Mistral-Nemo-Instruct-2407, and Gemma-2-9b-it on CNN/DM, CCSum, Multi-News, and SciTLDR. Human evaluation shows that BLooP significantly improves faithfulness without reducing readability. We make the code available at https://github.com/varuniyer/BLooP
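The abstract describes the mechanism only at a high level: a hash table lookup at each decoding step that favors tokens completing a bigram from the source. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the additive `bonus`, its default value, and the greedy selection are all assumptions made for illustration; the actual scoring rule is in the paper and the linked repository.

```python
from collections import defaultdict


def build_bigram_table(source_ids):
    """Map each source token id to the set of token ids that follow it in the source."""
    table = defaultdict(set)
    for prev, nxt in zip(source_ids, source_ids[1:]):
        table[prev].add(nxt)
    return table


def promote_bigrams(logits, prev_token, table, bonus=2.0):
    """Return a copy of `logits` with `bonus` added to every token that would
    complete a source bigram starting at `prev_token`.

    The additive bonus is an assumption of this sketch, not the paper's formula.
    """
    boosted = list(logits)
    for tok in table.get(prev_token, ()):  # single hash lookup per decoding step
        if 0 <= tok < len(boosted):
            boosted[tok] += bonus
    return boosted


def greedy_step(logits, prev_token, table, bonus=2.0):
    """Pick the highest-scoring token id after bigram promotion."""
    boosted = promote_bigrams(logits, prev_token, table, bonus)
    return max(range(len(boosted)), key=boosted.__getitem__)
```

In a real decoder the `logits` list would come from the model at each step and `source_ids` from tokenizing the source document. Because the intervention only adjusts scores before token selection, it needs no training or change to the model, which matches the training-free claim in the abstract.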

