AI, Feb 17, 2025

Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption

arXiv:2502.11723v21 citations, h-index: 9
Originality: Incremental advance
AI Analysis

This paper addresses the energy efficiency of LLM deployment, giving developers and researchers practical insights for reducing computational cost while maintaining text quality.

This paper investigates how different decoding strategies for text generation in Large Language Models affect GPU energy consumption. It finds that the choice of strategy can significantly change energy usage even when it has minimal effect on output quality, and that the trade-off between quality and efficiency varies across tasks.

Decoding strategies significantly influence the quality and diversity of the generated text in Large Language Models (LLMs), yet their impact on computational resources, particularly GPU energy consumption, is insufficiently studied. This paper investigates the relationship between text generation decoding techniques and energy efficiency, focusing on the trade-off between generation quality and GPU energy usage across diverse tasks and decoding configurations. By benchmarking multiple strategies across various tasks, including Translation, Math Problem Solving, Coding, and Open-ended text generation, we reveal how selecting appropriate decoding techniques with their tuned hyperparameters affects text quality and has measurable implications for energy consumption. Our findings show that the choice of decoding strategy can greatly impact GPU energy usage, even when it has a minimal effect on output quality. Different strategies also involve trade-offs between quality and energy efficiency, and no single decoding method is best in all cases across every metric. To the best of our knowledge, this is one of the first studies to examine decoding strategies in LLMs from the perspective of energy consumption, providing useful insights for building energy-efficient applications without compromising text generation quality.
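
The quality-versus-energy trade-off described above can be made concrete with a small sketch. It is not the paper's measurement pipeline: it assumes GPU energy is estimated by integrating sampled power readings over time, and that each benchmark run is summarized by a quality score and an energy-per-token figure. The names `energy_joules`, `Run`, and `pareto_front` are hypothetical, and every number in the usage example is a placeholder, not a result from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Estimate energy (J) from power samples (W) by trapezoidal integration.

    Assumption: power was sampled at the given timestamps (in seconds) while
    the model was generating; how it was sampled is outside this sketch.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


@dataclass
class Run:
    """One benchmark run of a decoding configuration (hypothetical record)."""
    strategy: str
    quality: float   # task metric, higher is better
    energy_j: float  # total GPU energy for the run, in joules
    tokens: int      # number of generated tokens

    @property
    def joules_per_token(self) -> float:
        return self.energy_j / self.tokens


def pareto_front(runs: Sequence[Run]) -> List[Run]:
    """Return the runs that no other run beats on both quality and energy.

    A run is dominated if another run has quality at least as high and
    energy per token at least as low, with a strict improvement in one.
    """
    front = []
    for r in runs:
        dominated = any(
            o.quality >= r.quality
            and o.joules_per_token <= r.joules_per_token
            and (o.quality > r.quality or o.joules_per_token < r.joules_per_token)
            for o in runs
        )
        if not dominated:
            front.append(r)
    return front


if __name__ == "__main__":
    # Placeholder values for illustration only; not measurements.
    runs = [
        Run("strategy_a", quality=0.80, energy_j=100.0, tokens=50),
        Run("strategy_b", quality=0.80, energy_j=200.0, tokens=50),
        Run("strategy_c", quality=0.90, energy_j=300.0, tokens=50),
    ]
    for r in pareto_front(runs):
        print(r.strategy, r.quality, r.joules_per_token)
```

When more than one run survives on the front, no configuration is best on both axes at once, which is the situation the abstract describes. A run that matches another on quality but costs more energy per token drops out, reflecting the observation that strategy choice can matter for energy even when quality barely changes.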
