DC AI · May 15

Designing Datacenter Power Delivery Hierarchies for the AI Era

Grant Wilkins, Fiodar Kazhamiaka, Alok Gautam Kumbhare, Chaojie Zhang, Ricardo Bianchini

arXiv:2605.16255

AI Analysis

For datacenter designers and operators, this work provides a framework to evaluate power delivery designs under realistic conditions, highlighting the importance of considering multi-resource stranding and long-term deployable capacity.

The paper addresses the challenge of designing datacenter power delivery hierarchies for high-density AI workloads, with rack power density projected to approach 1 MW per deployment by 2027. Using a framework that combines projection models for GPU, compute, and storage deployments with operational data from Microsoft Azure, the authors show that multi-resource stranding materially changes deployable capacity, effective capital expenditure, and delivered performance. They conclude that the relevant planning objective is deployable capacity over time rather than installed megawatts.

Demand for AI accelerators is rapidly increasing rack power density, with projections approaching 1 MW per deployment by 2027. This poses a major challenge for datacenter power delivery designers. As power densities increase, a datacenter designed for a different target density may strand power, i.e., may be unable to use all the power that its delivery hierarchy has provisioned. Designs must remain efficient over long datacenter lifetimes and multiple hardware generations. Power utilization is particularly important as grid power capacity is a scarce resource in the AI era. Designing an efficient power delivery hierarchy for the long run is difficult because rack placement feasibility, workload impact, and cost depend jointly on electrical topology, deployment granularity, placement policy, power oversubscription, and workload mix. Moreover, these factors evolve over time, have interdependencies across multiple resource dimensions, and generally do not lend themselves to closed-form analysis. To address this challenge, we develop a framework for evaluating datacenter power delivery designs using throughput, power, and cost metrics over realistic arrival, oversubscription, and decommissioning sequences. The framework combines projection models for GPU, compute, and storage deployments with operational factors grounded in production data from Microsoft Azure. Our results show that multi-resource stranding materially changes deployable capacity, effective capital expenditure, and delivered performance, and quantify how rising density from rack- and pod-scale AI systems shapes these outcomes. For AI datacenter design, the relevant planning objective is not installed megawatts, but deployable capacity over time.
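To make the notion of stranding concrete, the following is a minimal sketch and not the authors' framework. It assumes a hypothetical row with a fixed power budget and a fixed number of rack positions, filled with identical racks. The `Row` class, the `place_racks` helper, and all numbers are illustrative assumptions rather than values from the paper. The sketch shows how a row sized for one rack density can strand power or space when the deployed density differs.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical row: a power budget plus a fixed number of rack positions."""
    power_kw: float
    slots: int


def place_racks(row: Row, rack_kw: float) -> dict:
    """Fill the row with identical racks until power or space runs out."""
    if rack_kw <= 0:
        raise ValueError("rack_kw must be positive")
    by_power = int(row.power_kw // rack_kw)  # racks the power budget allows
    placed = min(by_power, row.slots)        # racks that also fit in the space
    used_kw = placed * rack_kw
    return {
        "racks": placed,
        "stranded_kw": row.power_kw - used_kw,  # provisioned but unusable power
        "stranded_slots": row.slots - placed,   # empty rack positions
    }


# Illustrative numbers only: a row designed for twenty 30 kW racks.
row = Row(power_kw=600.0, slots=20)
for rack_kw in (30.0, 20.0, 130.0):
    print(rack_kw, place_racks(row, rack_kw))
```

Under these assumed numbers, the row is fully used at its design density of 30 kW per rack. Lower-density 20 kW racks exhaust the slots and strand 200 kW of power, while 130 kW racks exhaust the power budget after four racks, stranding 80 kW and 16 slots. The paper's framework evaluates such effects across more resource dimensions and over arrival, oversubscription, and decommissioning sequences, which this sketch does not attempt to model.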
