LG AI CV Sep 22, 2025

Joint Memory Frequency and Computing Frequency Scaling for Energy-efficient DNN Inference

Yunchu Han, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

arXiv:2509.17970v34.1h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses the energy efficiency of DNN inference on resource-constrained hardware such as mobile or IoT devices, but it appears incremental: it builds on existing DVFS techniques by adding memory frequency adjustment.

The paper tackles the high latency and energy consumption of DNN inference on resource-constrained devices by proposing joint scaling of memory and computing frequencies. Simulation results show that the joint scaling reduces energy consumption.

Deep neural networks (DNNs) have been widely applied in diverse applications, but high latency and energy overhead are inevitable on resource-constrained devices. To address this challenge, most researchers focus on the dynamic voltage and frequency scaling (DVFS) technique, which balances latency and energy consumption by changing the computing frequency of processors. However, memory frequency, which also plays a significant role in inference time and energy consumption, is usually ignored and not fully exploited to achieve efficient DNN inference. In this paper, we first investigate the impact of joint memory frequency and computing frequency scaling on inference time and energy consumption with a model-based and data-driven method. Then, using the fitted parameters of different DNN models, we give a preliminary analysis of the proposed model to show the effects of adjusting the memory frequency and computing frequency simultaneously. Finally, simulation results in local inference and cooperative inference cases further validate the effectiveness of jointly scaling the memory frequency and computing frequency in reducing the energy consumption of devices.
