AS · AI · LG · SD · May 12, 2025

Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models

arXiv:2505.07615v2 · 2 citations · h-index: 14 · WASPAA
Originality: Synthesis-oriented
AI Analysis

The paper addresses energy-efficiency concerns for users of generative audio models, though its contribution is incremental: it analyzes existing models rather than proposing new methods.

It measures the inference-time energy consumption of 7 state-of-the-art text-to-audio diffusion models, finds trade-offs between audio quality and energy use, and identifies Pareto-optimal configurations across the models.

Text-to-audio models have recently emerged as a powerful technology for generating sound from textual descriptions. However, their high computational demands raise concerns about energy consumption and environmental impact. In this paper, we conduct an analysis of the energy usage of 7 state-of-the-art text-to-audio diffusion-based generative models, evaluating to what extent variations in generation parameters affect energy consumption at inference time. We also aim to identify an optimal balance between audio quality and energy consumption by considering Pareto-optimal solutions across all selected models. Our findings provide insights into the trade-offs between performance and environmental impact, contributing to the development of more efficient generative audio models.
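The notion of Pareto-optimal solutions mentioned in the abstract can be illustrated with a minimal sketch. It assumes each run is summarized by two numbers: energy per generated clip (lower is better) and a quality score (higher is better). The `Run` type, the configuration names, and all values are hypothetical and are not taken from the paper, whose actual quality metric and measurement setup are not given here.

```python
from typing import NamedTuple


class Run(NamedTuple):
    name: str          # hypothetical configuration label
    energy_wh: float   # assumed energy per generated clip (lower is better)
    quality: float     # assumed quality score (higher is better)


def pareto_front(runs: list[Run]) -> list[Run]:
    """Return the runs that no other run dominates, sorted by energy."""
    def dominates(a: Run, b: Run) -> bool:
        # a dominates b if it is no worse on both axes and strictly better on one.
        no_worse = a.energy_wh <= b.energy_wh and a.quality >= b.quality
        strictly_better = a.energy_wh < b.energy_wh or a.quality > b.quality
        return no_worse and strictly_better

    front = [r for r in runs if not any(dominates(o, r) for o in runs)]
    return sorted(front, key=lambda r: r.energy_wh)


# Made-up numbers for illustration only; these are not results from the paper.
runs = [
    Run("model_a_25_steps", 0.10, 0.60),
    Run("model_a_100_steps", 0.40, 0.80),
    Run("model_b_50_steps", 0.30, 0.70),
    Run("model_b_200_steps", 0.90, 0.78),
    Run("model_c_50_steps", 0.35, 0.65),
]

front_names = [r.name for r in pareto_front(runs)]
print(front_names)
```

A configuration stays on the front only if no alternative is at least as good on both axes and strictly better on one. If the quality metric is a distance where lower is better, the comparison on `quality` would need to be flipped.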

