Improving Diversity in Language Models: When Temperature Fails, Change the Loss
This work addresses the challenge of increasing diversity in language models for applications that require robust text generation, although the contribution appears incremental because it builds on existing methods such as temperature scaling.
The paper tackles the problem of improving diversity in language models. It analyzes why temperature scaling often fails to boost coverage, and it proposes rethinking loss functions through the Precision-Recall framework to achieve a better trade-off between Precision and Recall.
Increasing diversity in language models is a challenging yet essential objective. A common approach is to raise the decoding temperature. In this work, we investigate this approach through a simple yet common case to provide insights into why decreasing temperature can improve quality (Precision), while increasing it often fails to boost coverage (Recall). Our analysis reveals that for a model to be effectively tunable through temperature adjustments, it must be trained toward coverage. To address this, we propose rethinking loss functions in language models by leveraging the Precision-Recall framework. Our results demonstrate that this approach achieves a substantially better trade-off between Precision and Recall than merely combining negative log-likelihood training with temperature scaling. These findings offer a pathway toward more versatile and robust language modeling techniques.
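As a point of reference, decoding-temperature scaling is usually implemented by dividing the logits by a temperature T before the softmax. The sketch below is a minimal illustration of that standard mechanism only; it does not reproduce the paper's analysis or its proposed losses. The logits are made-up values, and the names `apply_temperature` and `entropy` are hypothetical helpers introduced here.

```python
import math


def apply_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; a rough proxy for how spread out sampling is."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical next-token logits (assumed for illustration, not from the paper).
logits = [4.0, 2.0, 1.0, -3.0]

for t in (0.5, 1.0, 2.0):
    probs = apply_temperature(logits, t)
    print(f"T={t}: probs={[round(p, 3) for p in probs]}, "
          f"entropy={entropy(probs):.3f}")
```

Lowering T concentrates mass on the highest-logit token, while raising T flattens the distribution. Either way, temperature only rescales the logits the model has already learned, so the ranking of tokens is unchanged. That may help give intuition for why training, rather than decoding alone, matters for coverage.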