CL · AI · Oct 11, 2024

SimpleStrat: Diversifying Language Model Generation with Stratification

arXiv:2410.09038v2 · 10 citations · h-index: 72
Originality: Incremental advance
AI Analysis

The paper addresses the need for diverse outputs in LLM applications, but the advance is incremental: it builds on existing diversity methods by adding a stratification approach.

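The paper tackles the problem of generating diverse responses from large language models for applications such as planning and synthetic data generation. It shows that raising the sampling temperature lowers the quality of individual generations, and that the approach depends on the model's next-token probabilities being close to the true distribution of answers. It proposes SimpleStrat, which uses the language model itself to partition the answer space into strata and then samples from within a randomly selected stratum, achieving 0.05 higher recall compared to GPT-4o and a 0.36 average reduction in KL Divergence compared to Llama 3.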
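To make the stratified sampling idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes the strata are already given: in SimpleStrat the language model proposes them, whereas here `TOY_STRATA` is a hand-written stand-in for an illustrative question such as "Name a US state". The hypothetical `toy_generate` function plays the role of the language model conditioned on a stratum.

```python
import random
from typing import Callable, Sequence

# Hypothetical strata for an underspecified question ("Name a US state").
# In SimpleStrat the language model proposes the partition; this dict is
# only an illustrative stand-in.
TOY_STRATA = {
    "West": ["California", "Oregon"],
    "South": ["Texas", "Florida"],
    "Northeast": ["Maine", "Vermont"],
}


def toy_generate(stratum: str, rng: random.Random) -> str:
    """Stand-in for an LLM call conditioned on the chosen stratum."""
    return rng.choice(TOY_STRATA[stratum])


def stratified_sample(
    strata: Sequence[str],
    generate: Callable[[str, random.Random], str],
    n: int,
    seed: int = 0,
) -> list[str]:
    """Draw n responses: pick a stratum uniformly at random, then sample within it."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        stratum = rng.choice(strata)            # step 1: select a random stratum
        samples.append(generate(stratum, rng))  # step 2: sample inside that stratum
    return samples


samples = stratified_sample(list(TOY_STRATA), toy_generate, n=6)
```

Because the stratum is chosen uniformly before any text is generated, diversity across strata no longer depends on how the model's next-token probabilities are spread over the answers.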

Generating diverse responses from large language models (LLMs) is crucial for applications such as planning/search and synthetic data generation, where diversity provides distinct answers across generations. Prior approaches rely on increasing temperature to increase diversity. However, contrary to popular belief, we show that not only does this approach produce lower-quality individual generations as temperature increases, but it also depends on the model's next-token probabilities being similar to the true distribution of answers. We propose SimpleStrat, an alternative approach that uses the language model itself to partition the space into strata. At inference, a random stratum is selected and a sample is drawn from within that stratum. To measure diversity, we introduce CoverageQA, a dataset of underspecified questions with multiple equally plausible answers, and assess diversity by measuring the KL Divergence between the output distribution and the uniform distribution over valid ground truth answers. As computing the probability per response/solution for proprietary models is infeasible, we measure recall on ground truth solutions. Our evaluation shows that SimpleStrat achieves 0.05 higher recall compared to GPT-4o and a 0.36 average reduction in KL Divergence compared to Llama 3.
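The two diversity measures named in the abstract can be illustrated with a short Python sketch, under stated assumptions. The abstract does not give the direction of the divergence or how the output distribution is obtained. This sketch assumes KL(P || U), with P estimated from empirical response frequencies and responses outside the valid set discarded. The function names `kl_to_uniform` and `recall` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


def kl_to_uniform(responses: Iterable[str], valid_answers: Sequence[str]) -> float:
    """KL(P || U): P is the empirical distribution of valid responses,
    U is uniform over the valid answers. Zero means perfectly uniform coverage.
    Assumption: invalid responses are dropped before estimating P."""
    valid = set(valid_answers)
    counts = Counter(r for r in responses if r in valid)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid responses to estimate a distribution from")
    k = len(valid)
    # Each term is p * log(p / (1/k)); answers with p == 0 contribute nothing.
    return sum((c / total) * math.log((c / total) * k) for c in counts.values())


def recall(responses: Iterable[str], valid_answers: Sequence[str]) -> float:
    """Fraction of valid ground-truth answers that appear at least once."""
    valid = set(valid_answers)
    return len(valid & set(responses)) / len(valid)
```

Recall needs only the sampled responses, which is why it remains usable for proprietary models where per-response probabilities cannot be computed.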

