Asking a Language Model for Diverse Responses
This work addresses the need for more varied outputs in language model applications, though it is incremental as it builds on existing sampling methods.
The paper addresses the problem of generating diverse responses from language models by comparing ancestral sampling with enumeration and iterative sampling, and finds that the latter two achieve higher diversity at comparable quality under matched budgets.
Large language models increasingly rely on explicit reasoning chains and can produce multiple plausible responses for a given context. We study the candidate sampler that produces this set of plausible responses, contrasting ancestral (parallel) sampling with two alternatives: enumeration, which asks the model to produce $n$ candidates in one pass, and iterative sampling, which proposes candidates sequentially while conditioning on the responses generated so far. Under matched budgets, we compare these samplers on quality, lexical and computation flow diversity, and efficiency. Our empirical results demonstrate that the enumeration and iterative strategies yield higher diversity at comparable quality. Our findings highlight the potential of simple non-independent sampling strategies to improve response diversity without sacrificing generation quality.
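To make the contrast between the three samplers concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `generate` callable that maps a prompt string to one completion string. The prompt wording, the one-response-per-line parsing, and the `distinct_ratio` helper are illustrative assumptions that do not come from the paper.

```python
from typing import Callable, List

# Hypothetical model interface (an assumption): prompt string in, completion string out.
Generate = Callable[[str], str]


def ancestral_sampling(generate: Generate, context: str, n: int) -> List[str]:
    """Draw n responses independently; every call sees only the context."""
    return [generate(context) for _ in range(n)]


def enumeration_sampling(generate: Generate, context: str, n: int) -> List[str]:
    """Ask for n candidates in one pass and split the output by line."""
    # The instruction text below is an illustrative assumption.
    prompt = f"{context}\nList {n} different responses, one per line."
    lines = [ln.strip() for ln in generate(prompt).splitlines() if ln.strip()]
    return lines[:n]


def iterative_sampling(generate: Generate, context: str, n: int) -> List[str]:
    """Propose candidates one at a time, conditioning on the responses so far."""
    responses: List[str] = []
    for _ in range(n):
        shown = "\n".join(f"- {r}" for r in responses)
        # The instruction text below is an illustrative assumption.
        prompt = (
            f"{context}\nPrevious responses:\n{shown}\n"
            "Give a different response."
        )
        responses.append(generate(prompt))
    return responses


def distinct_ratio(responses: List[str]) -> float:
    """Fraction of unique responses; a crude stand-in for a diversity metric."""
    return len(set(responses)) / len(responses) if responses else 0.0
```

In this sketch, ancestral sampling makes n independent calls, enumeration makes a single call, and iterative sampling makes n calls whose prompts grow with the response set. A matched-budget comparison would need to account for that difference in prompt and output length.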