AR · Apr 22

EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads

arXiv:2604.2010551.81 citations · h-index: 3
AI Analysis

EnergAIzer addresses the scalability bottleneck in power management for datacenters running AI workloads. The contribution appears incremental: it builds on existing power modeling techniques and adds a novel method for predicting their inputs.

The paper tackles GPU power estimation for AI workloads with EnergAIzer, a framework that predicts hardware utilization inputs analytically rather than through costly simulation or profiling. This reduces estimation time from hours to seconds while achieving 8% power error on NVIDIA Ampere GPUs and 7% error when forecasting the power of the H100.

As AI workloads drive increases in datacenter power consumption, accurate GPU power estimation is critical for proactive power management. However, existing power models face a scalability bottleneck not in the modeling techniques themselves, but in obtaining the hardware utilization inputs they require. Conventional approaches rely on either costly simulation or hardware profiling, which makes them impractical when rapid predictions are required. This work presents EnergAIzer, which addresses this scalability bottleneck by developing a lightweight solution to predict utilization inputs, reducing the estimation wall time from hours to seconds. Our key insight is that kernels in AI workloads commonly employ optimizations that create structured patterns, which analytically determine memory traffic and the execution timeline. We construct a performance model using these patterns as an analytical scaffold for empirical data fitting, which also naturally exposes module-level utilization. This predicted utilization is then fed into our power model to estimate dynamic power consumption. EnergAIzer achieves 8% power error on NVIDIA Ampere GPUs, competitive with traditional power models that rely on elaborate cycle-level simulation or hardware profiling. We demonstrate EnergAIzer's exploration capabilities for frequency scaling and architectural configurations, including forecasting the power of the NVIDIA H100 with just 7% error. In summary, EnergAIzer provides fast and accurate power prediction for AI workloads, paving the way for power-aware design explorations.
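
The pipeline described in the abstract (structured kernel pattern, then memory traffic and timeline, then module utilization, then dynamic power) can be illustrated with a toy sketch. This is not the paper's model: the tiled-GEMM traffic formula, the roofline-style timing, the linear power model, and every name and coefficient below (`Gpu`, `gemm_utilization`, `dynamic_power`, the per-module wattages) are assumptions chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    peak_flops: float   # peak compute throughput, FLOP/s
    dram_bw: float      # DRAM bandwidth, bytes/s
    p_compute_w: float  # hypothetical dynamic power of compute units at 100% utilization, W
    p_dram_w: float     # hypothetical dynamic power of DRAM at 100% utilization, W


def gemm_utilization(m, n, k, tile, gpu, bytes_per_elem=2):
    """Estimate module utilization of a tiled GEMM from its tiling pattern.

    Assumes square output tiles and no reuse across tiles, so A is re-read
    once per tile column and B once per tile row.
    """
    flops = 2.0 * m * n * k
    tiles_m = -(-m // tile)  # ceiling division
    tiles_n = -(-n // tile)
    traffic = bytes_per_elem * (m * k * tiles_n + k * n * tiles_m + m * n)

    # Roofline-style timeline: the slower of compute and memory sets the runtime.
    t_compute = flops / gpu.peak_flops
    t_memory = traffic / gpu.dram_bw
    t = max(t_compute, t_memory)
    return {"compute": t_compute / t, "dram": t_memory / t, "time_s": t}


def dynamic_power(util, gpu):
    """Linear utilization-based dynamic power estimate (an assumed model form)."""
    return util["compute"] * gpu.p_compute_w + util["dram"] * gpu.p_dram_w


if __name__ == "__main__":
    # Illustrative numbers only; they do not describe any real GPU.
    gpu = Gpu(peak_flops=1e12, dram_bw=1e11, p_compute_w=100.0, p_dram_w=50.0)
    util = gemm_utilization(1024, 1024, 1024, tile=128, gpu=gpu)
    print(util, dynamic_power(util, gpu))
```

Because utilization comes from closed-form expressions rather than simulation or profiling, each estimate costs only a few arithmetic operations, which is the kind of speedup the abstract attributes to analytical input prediction.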

