Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training
PeZO removes a hardware bottleneck in on-device training, making zeroth-order optimization feasible on resource-constrained platforms.
Zeroth-order optimization requires generating many Gaussian random numbers, which is costly or even infeasible on hardware such as FPGAs and ASICs. The paper proposes PeZO, a framework that reduces the LUTs and FFs needed for random number generation by 48.6% and 12.7%, respectively, and saves up to 86% of power consumption without compromising training performance.
Zeroth-order (ZO) optimization is an emerging deep neural network (DNN) training paradigm that offers computational simplicity and memory savings. However, this seemingly promising approach faces a significant and long-ignored challenge: ZO requires generating a large number of Gaussian random numbers, which is difficult and sometimes infeasible on hardware platforms such as FPGAs and ASICs. In this paper, we identify this critical issue, which arises from the mismatch between algorithm designers and hardware designers. To address it, we propose PeZO, a perturbation-efficient ZO framework. Specifically, we design random number reuse strategies that significantly reduce the demand for random number generation, and we introduce a hardware-friendly adaptive scaling method that replaces the costly Gaussian distribution with a uniform distribution. Our experiments show that PeZO reduces the LUTs and FFs required for random number generation by 48.6% and 12.7%, respectively, and saves up to 86% of power consumption, all without compromising training performance, making ZO optimization feasible for on-device training. To the best of our knowledge, we are the first to explore the potential of on-device ZO optimization, providing valuable insights for future research.
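The two ideas named in the abstract, random number reuse and a scaled uniform perturbation, can be illustrated with a small sketch. This is not the PeZO implementation: the abstract does not specify the reuse strategies or the adaptive scaling rule, so the pool-with-offset reuse scheme, the rescaling to norm sqrt(dim), the two-point ZO-SGD update, and every function name below are assumptions chosen only for illustration.

```python
import math
import random


def make_pool(size, seed=0):
    """Generate a small pool of uniform numbers in [-1, 1) once, up front."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def perturbation(pool, dim, offset):
    """Read `dim` pooled values starting at `offset` (wrapping around) and
    rescale them so the vector's L2 norm is sqrt(dim), roughly the norm of a
    standard Gaussian vector of the same length (an assumed scaling rule)."""
    raw = [pool[(offset + i) % len(pool)] for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in raw))
    if norm == 0.0:
        raise ValueError("degenerate all-zero window; regenerate the pool")
    scale = math.sqrt(dim) / norm
    return [scale * x for x in raw]


def zo_step(loss, theta, pool, offset, lr=0.05, eps=1e-3):
    """One two-point ZO-SGD update: two forward passes, no backpropagation."""
    z = perturbation(pool, len(theta), offset)
    plus = [t + eps * zi for t, zi in zip(theta, z)]
    minus = [t - eps * zi for t, zi in zip(theta, z)]
    g = (loss(plus) - loss(minus)) / (2.0 * eps)  # scalar projected gradient
    return [t - lr * g * zi for t, zi in zip(theta, z)]


def train(loss, theta, pool, steps=300, seed=1):
    """Run ZO-SGD; each step draws only one integer offset into the pool."""
    rng = random.Random(seed)
    for _ in range(steps):
        theta = zo_step(loss, theta, pool, rng.randrange(len(pool)))
    return theta


def quadratic(theta):
    """Toy loss used only for the demonstration."""
    return sum(t * t for t in theta)


if __name__ == "__main__":
    pool = make_pool(16)
    start = [1.0] * 8
    end = train(quadratic, start, pool)
    print(quadratic(start), quadratic(end))
```

In this sketch the 16 pooled values are generated once and reused at every step, so the per-step cost of random number generation drops to a single offset. On hardware, the pool would presumably sit in a small memory, though that mapping is also an assumption.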