LG AR | Oct 30, 2023

PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices

Minghao Yan, Hongyi Wang, Shivaram Venkataraman

arXiv:2310.19991v28.86 citations | h-index: 6

Originality: Incremental advance

AI Analysis

The paper addresses energy efficiency for edge-device deployments. The advance is incremental: it builds on prior energy-optimization work by focusing on hardware configurations that earlier studies often neglected.

The paper tackles the problem of high energy consumption during neural network inference on edge devices by proposing PolyThrottle, a method that optimizes hardware configurations using Constrained Bayesian Optimization, resulting in up to 36% energy savings for popular models.

As neural networks (NN) are deployed across diverse sectors, their energy demand correspondingly grows. While several prior works have focused on reducing energy consumption during training, the continuous operation of ML-powered systems leads to significant energy use during inference. This paper investigates how the configuration of on-device hardware elements such as GPU, memory, and CPU frequency, often neglected in prior studies, affects energy consumption for NN inference with regular fine-tuning. We propose PolyThrottle, a solution that optimizes configurations across individual hardware components using Constrained Bayesian Optimization in an energy-conserving manner. Our empirical evaluation uncovers novel facets of the energy-performance equilibrium, showing that we can save up to 36 percent of energy for popular models. We also validate that PolyThrottle can quickly converge towards near-optimal settings while satisfying application constraints.
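To make the optimization problem in the abstract concrete, the following is a minimal sketch of the underlying formulation: choose GPU, memory, and CPU frequencies that minimize energy per inference while keeping latency within a budget. It is not the paper's method. PolyThrottle uses Constrained Bayesian Optimization to avoid measuring every configuration, whereas this sketch simply enumerates a small grid. The frequency levels, the `latency_ms` and `power_w` models, and all coefficients are hypothetical stand-ins for real on-device measurements.

```python
from itertools import product

# Hypothetical frequency levels in GHz (illustrative, not from the paper).
GPU_FREQS = [0.4, 0.6, 0.8, 1.0, 1.2]
MEM_FREQS = [0.8, 1.2, 1.6, 2.0]
CPU_FREQS = [0.6, 1.0, 1.4, 1.8]


def latency_ms(gpu, mem, cpu):
    """Synthetic latency model: each component's work shrinks as it is clocked higher."""
    return 6.0 / gpu + 4.0 / mem + 2.0 / cpu


def power_w(gpu, mem, cpu):
    """Synthetic power model: static draw plus a frequency-dependent term per component."""
    return 3.0 + 4.0 * gpu ** 3 + 1.5 * mem + 1.0 * cpu


def energy_mj(gpu, mem, cpu):
    """Energy per inference in millijoules (watts times milliseconds)."""
    return power_w(gpu, mem, cpu) * latency_ms(gpu, mem, cpu)


def best_config(latency_budget_ms):
    """Exhaustively search the grid for the lowest-energy configuration
    whose latency meets the budget. Returns None if nothing is feasible."""
    feasible = [
        cfg
        for cfg in product(GPU_FREQS, MEM_FREQS, CPU_FREQS)
        if latency_ms(*cfg) <= latency_budget_ms
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda cfg: energy_mj(*cfg))


if __name__ == "__main__":
    cfg = best_config(12.0)
    if cfg is None:
        print("no configuration meets the latency budget")
    else:
        print("config (GPU, mem, CPU GHz):", cfg)
        print("latency ms:", round(latency_ms(*cfg), 2))
        print("energy mJ:", round(energy_mj(*cfg), 2))
```

On a real device each grid point would require a timed, power-metered inference run, so exhaustive search becomes expensive as the number of tunable components grows. That cost is the motivation for a sample-efficient optimizer such as the one the abstract describes.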
