EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling
EDT addresses a specific decoding issue for LLM users: it offers an incremental improvement over fixed-temperature sampling.
The paper tackles the problem of balancing generation quality and diversity in Large Language Models by proposing an Entropy-based Dynamic Temperature (EDT) Sampling method, which dynamically selects temperature parameters and significantly outperforms existing strategies across four different generation benchmarks.
Recently, Large Language Models (LLMs) have demonstrated outstanding performance across a wide range of downstream language tasks. Temperature sampling is a commonly used decoding strategy in LLM generation. However, a fixed temperature parameter is used in most cases, which may not always be the optimal choice for balancing generation quality and diversity. In this paper, we propose an effective Entropy-based Dynamic Temperature (EDT) Sampling method, which achieves a more balanced performance in terms of both generation quality and diversity by dynamically selecting the temperature parameter. We also report model performance and comprehensive analyses on 4 different generation benchmarks. Our experiments show that EDT significantly outperforms the existing strategies across different tasks.
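To make the idea of entropy-based temperature selection concrete, here is a minimal sketch of one decoding step. The abstract does not state the exact mapping from entropy to temperature, so the rule used here, T = base_temp * decay^(theta / H), is an assumption chosen for illustration, as are all function names, parameter names, and default values. The only property it is meant to show is that a confident (low-entropy) next-token distribution gets a lower temperature than an uncertain (high-entropy) one.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities at the given temperature."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def dynamic_temperature(logits, base_temp=1.0, theta=1.0, decay=0.8, eps=1e-8):
    """Pick a temperature from the entropy of the unscaled distribution.

    ASSUMPTION: this mapping and its defaults are illustrative, not taken
    from the text above. Low entropy -> temperature near 0 (close to greedy);
    high entropy -> temperature approaches base_temp.
    """
    h = entropy(softmax(logits))
    return base_temp * decay ** (theta / (h + eps))


def sample_token(logits, rng, min_temp=1e-6, **kwargs):
    """Sample one token index using the dynamically chosen temperature."""
    t = dynamic_temperature(logits, **kwargs)
    probs = softmax(logits, max(t, min_temp))  # clamp to avoid dividing by 0
    token = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return token, t


if __name__ == "__main__":
    rng = random.Random(0)
    for name, logits in [("peaked", [10.0, 0.0, 0.0, 0.0]),
                         ("flat", [0.0, 0.0, 0.0, 0.0])]:
        token, t = sample_token(logits, rng)
        print(f"{name}: temperature={t:.3g}, sampled token={token}")
```

In a real decoder this step would run once per generated token on the model's next-token logits, so the temperature varies across positions instead of being fixed for the whole sequence.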