CL, LG | Nov 17, 2023

Energy and Carbon Considerations of Fine-Tuning BERT

arXiv:2311.10267v2 | 137 citations | h-index: 27
Originality: Synthesis-oriented
AI Analysis

This paper addresses the environmental impact of NLP practice for researchers and practitioners, highlighting a previously understudied source of energy consumption.

The paper quantifies the energy use and carbon emissions of fine-tuning BERT, a cost that is often overlooked in favor of pre-training. It finds that fine-tuning contributes significantly to NLP's environmental footprint because many actors perform it frequently.

Despite the popularity of the "pre-train then fine-tune" paradigm in the NLP community, existing work quantifying energy costs and associated carbon emissions has largely focused on language model pre-training. Although a single pre-training run draws substantially more energy than fine-tuning, fine-tuning is performed more frequently by many more individual actors, and thus must be accounted for when considering the energy and carbon footprint of NLP. In order to better characterize the role of fine-tuning in the landscape of energy and carbon emissions in NLP, we perform a careful empirical study of the computational costs of fine-tuning across tasks, datasets, hardware infrastructure and measurement modalities. Our experimental results allow us to place fine-tuning energy and carbon costs into perspective with respect to pre-training and inference, and outline recommendations to NLP researchers and practitioners who wish to improve their fine-tuning energy efficiency.
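
The abstract's argument (one pre-training run costs more, but fine-tuning runs are far more numerous) comes down to simple arithmetic. The following is a minimal sketch of that reasoning, not the paper's measurement method. The function names are hypothetical, and every number is a placeholder chosen for illustration rather than a result from the paper.

```python
import math


def total_energy_kwh(energy_per_run_kwh: float, num_runs: int) -> float:
    """Aggregate energy of num_runs runs that each draw the same energy."""
    return energy_per_run_kwh * num_runs


def emissions_kg_co2e(energy_kwh: float, intensity_kg_per_kwh: float) -> float:
    """Convert energy to emissions using a grid carbon intensity (kg CO2e per kWh)."""
    return energy_kwh * intensity_kg_per_kwh


def break_even_runs(pretrain_kwh: float, finetune_kwh: float) -> int:
    """Smallest number of fine-tuning runs whose combined energy
    matches or exceeds that of a single pre-training run."""
    return math.ceil(pretrain_kwh / finetune_kwh)


# Placeholder values, NOT taken from the paper.
PRETRAIN_KWH = 1000.0    # assumed energy of one pre-training run
FINETUNE_KWH = 0.5       # assumed energy of one fine-tuning run
GRID_INTENSITY = 0.4     # assumed kg CO2e emitted per kWh

if __name__ == "__main__":
    runs = break_even_runs(PRETRAIN_KWH, FINETUNE_KWH)
    energy = total_energy_kwh(FINETUNE_KWH, runs)
    print(f"break-even fine-tuning runs: {runs}")
    print(f"aggregate fine-tuning energy: {energy} kWh")
    print(f"emissions: {emissions_kg_co2e(energy, GRID_INTENSITY)} kg CO2e")
```

Under these assumed inputs, the break-even count is just the ratio of the two per-run energies, which is why the number of actors who fine-tune matters as much as the per-run cost.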

