Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View
This work gives AI practitioners and researchers a standardized method for evaluating the environmental impact of large language model serving. The contribution is incremental in that it builds on existing carbon-emission benchmarking studies.
The authors address the lack of a standardized basis for comparing carbon emissions in large language model (LLM) serving by introducing FUEL, a framework built on the functional unit (FU). Its case studies reveal trade-offs in reducing emissions through choices of model size, quantization strategy, and hardware.
Large language models (LLMs) offer powerful capabilities but come with a significant environmental impact, particularly in carbon emissions. Existing studies benchmark carbon emissions but lack a standardized basis for comparison across different model configurations. To address this, we introduce the concept of a functional unit (FU) as a standardized basis and develop FUEL, the first FU-based framework for evaluating the environmental impact of LLM serving. Through three case studies, we uncover key insights and trade-offs in reducing carbon emissions by optimizing model size, quantization strategy, and hardware choice, paving the way for more sustainable LLM serving. The code is available at https://github.com/jojacola/FUEL.
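As a rough illustration of what comparison on a functional-unit basis could look like, the sketch below normalizes operational carbon by a count of functional units. It is not FUEL's implementation: the names `ServingRun` and `carbon_per_fu`, the configurations, and all numbers are hypothetical, the functional unit is assumed here to be a single served output token, and only operational carbon (energy multiplied by grid carbon intensity) is modeled.

```python
from dataclasses import dataclass


@dataclass
class ServingRun:
    """Hypothetical measurements from one serving configuration."""
    name: str
    energy_kwh: float        # energy consumed while serving (assumed measured)
    functional_units: int    # assumed FU: one served output token


def carbon_per_fu(run: ServingRun, carbon_intensity_g_per_kwh: float) -> float:
    """Return operational carbon in grams CO2e per functional unit."""
    if run.functional_units <= 0:
        raise ValueError("functional_units must be positive")
    total_g = run.energy_kwh * carbon_intensity_g_per_kwh
    return total_g / run.functional_units


# Illustrative values only; they are not results from the paper.
GRID_INTENSITY = 400.0  # g CO2e per kWh, assumed
runs = [
    ServingRun("config-a", energy_kwh=2.0, functional_units=1_000_000),
    ServingRun("config-b", energy_kwh=1.0, functional_units=400_000),
]

for run in runs:
    total_g = run.energy_kwh * GRID_INTENSITY
    per_fu = carbon_per_fu(run, GRID_INTENSITY)
    print(f"{run.name}: total={total_g:.0f} g, per FU={per_fu:.6f} g")
```

In this made-up example, config-b consumes less energy in total yet emits more per functional unit, which is the kind of reversal that comparing raw totals alone would hide.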