CL · AI · Jul 28, 2025

Mind the Gap: Conformative Decoding to Improve Output Diversity of Instruction-Tuned Large Language Models

Max Peeperkorn, Tom Kouwenhoven, Dan Brown, Anna Jordanous

arXiv:2507.20956v16.72 citations · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the loss of output diversity in instruction-tuned LLMs, which matters for tasks such as creative writing. The advance is incremental because it builds on existing decoding methods.

The paper tackles the problem of reduced output diversity in instruction-tuned large language models, showing significant decreases in diversity due to instruction-tuning, with DPO having the most substantial impact. It presents conformative decoding, a new decoding strategy that typically increases diversity and maintains or improves quality by guiding an instruct model using its more diverse base model.
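The summary above refers to measured decreases in diversity without naming the metrics. As a rough illustration only, the sketch below computes distinct-n, a common lexical diversity measure (the fraction of n-grams across a set of samples that are unique). Whether the paper uses this metric is an assumption; the function name `distinct_n` and the whitespace tokenization are illustrative choices, not taken from the paper.

```python
def distinct_n(texts, n=2):
    """Fraction of unique n-grams over all n-grams in a set of samples.

    Illustrative only: simple lowercased whitespace tokenization,
    not necessarily the metric or tokenizer used in the paper.
    """
    unique = set()
    total = 0
    for text in texts:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    # No n-grams at all (e.g. empty input) is reported as zero diversity.
    return len(unique) / total if total else 0.0


# Hypothetical samples: repeated outputs versus varied outputs.
repetitive = ["the cat sat", "the cat sat"]
varied = ["the cat sat", "a dog ran"]
print(distinct_n(repetitive), distinct_n(varied))
```

A lower score for a model's samples on the same prompt would indicate more repeated phrasing, which is the kind of gap the summary describes between base and instruct models.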

Instruction-tuning large language models (LLMs) reduces the diversity of their outputs, which has implications for many tasks, particularly for creative tasks. This paper investigates the "diversity gap" for a writing prompt narrative generation task. This gap emerges as measured by current diversity metrics for various open-weight and open-source LLMs. The results show significant decreases in diversity due to instruction-tuning. We explore the diversity loss at each fine-tuning stage for the OLMo and OLMo 2 models to further understand how output diversity is affected. The results indicate that DPO has the most substantial impact on diversity. Motivated by these findings, we present a new decoding strategy, conformative decoding, which guides an instruct model using its more diverse base model to reintroduce output diversity. We show that conformative decoding typically increases diversity and even maintains or improves quality.
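The abstract does not give the formula for conformative decoding, so the following is not the authors' method. It is a minimal sketch of one generic way a base model could influence an instruct model at decoding time: linearly interpolating the two next-token distributions. The function names, the `alpha` weight, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_next_token_probs(instruct_logits, base_logits, alpha=0.5):
    """Linear mixture of two next-token distributions.

    Assumption for illustration, not the paper's formulation:
    alpha=0 uses only the instruct model, alpha=1 only the base model.
    """
    if len(instruct_logits) != len(base_logits):
        raise ValueError("both models must share a vocabulary")
    p_instruct = softmax(instruct_logits)
    p_base = softmax(base_logits)
    return [(1 - alpha) * a + alpha * b for a, b in zip(p_instruct, p_base)]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Toy logits over a 4-token vocabulary: a peaked instruct model
# and a flatter base model.
instruct_logits = [5.0, 1.0, 0.0, 0.0]
base_logits = [2.0, 1.5, 1.0, 0.5]
mixed = mix_next_token_probs(instruct_logits, base_logits, alpha=0.5)
print(entropy(softmax(instruct_logits)), entropy(mixed))
```

In this toy setup the mixed distribution has higher entropy than the instruct distribution alone, so sampling from it would spread probability over more tokens. How the paper balances this against output quality is not stated in the abstract.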
