ApproxDARTS: Differentiable Neural Architecture Search with Approximate Multipliers
This work addresses energy efficiency for hardware-aware deep learning applications, but it is incremental as it builds upon the existing DARTS method.
The authors tackle the problem of reducing power consumption in deep neural networks by integrating approximate multipliers into neural architecture search. On CIFAR-10, the resulting network reduces the energy consumed by the arithmetic operations of inference by 53.84% compared to 32-bit floating-point multipliers and by 5.97% compared to exact 8-bit fixed-point multipliers, in both cases with a negligible accuracy drop.
Integrating the principles of approximate computing into the design of hardware-aware deep neural networks (DNNs) has led to DNN implementations that combine good output quality with highly optimized hardware parameters such as low latency or low inference energy. In this work, we present ApproxDARTS, a neural architecture search (NAS) method that enables the popular differentiable NAS method DARTS to exploit approximate multipliers and thus reduce the power consumption of the generated neural networks. We show on the CIFAR-10 data set that ApproxDARTS can perform a complete architecture search in less than $10$ GPU hours and produce competitive convolutional neural networks (CNNs) containing approximate multipliers in their convolutional layers. For example, ApproxDARTS created a CNN that reduces the energy consumption of the arithmetic operations in the inference phase by (a) $53.84\%$ compared to a CNN utilizing native $32$-bit floating-point multipliers and (b) $5.97\%$ compared to a CNN utilizing exact $8$-bit fixed-point multipliers, in both cases with a negligible accuracy drop. Moreover, ApproxDARTS is $2.3\times$ faster than EvoApproxNAS, a similar method based on an evolutionary algorithm.
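To make the idea of an approximate multiplier inside a convolutional layer concrete, the following minimal Python sketch emulates an 8-bit multiplier with a lookup table and uses it in a dot product, the core operation of a convolution. The abstract does not say which approximate multiplier circuits ApproxDARTS uses, so the operand-truncation scheme here is only an illustrative stand-in, and the names `build_truncated_lut`, `approx_dot`, and `drop_bits` are hypothetical.

```python
def build_truncated_lut(drop_bits=2):
    """Build a 256x256 product table for unsigned 8-bit operands.

    Assumption: the 'approximation' zeroes the `drop_bits` least
    significant bits of each operand before multiplying. This is an
    illustrative scheme, not the multiplier used in the paper.
    """
    mask = 0xFF & ~((1 << drop_bits) - 1)
    return [[(a & mask) * (b & mask) for b in range(256)] for a in range(256)]


def approx_dot(xs, ws, lut):
    """Dot product in which every multiplication is a table lookup."""
    return sum(lut[x][w] for x, w in zip(xs, ws))


# Hypothetical 8-bit activations and weights for one small filter window.
activations = [12, 200, 7, 255]
weights = [3, 129, 64, 255]

lut = build_truncated_lut(drop_bits=2)
exact = sum(x * w for x, w in zip(activations, weights))
approx = approx_dot(activations, weights, lut)
relative_error = (exact - approx) / exact

print(f"exact={exact} approx={approx} relative error={relative_error:.2%}")
```

Replacing each multiplication with a table lookup is a common way to simulate approximate arithmetic in software, because any 8-bit multiplier, exact or approximate, is fully described by its 256 x 256 output table. With `drop_bits=0` the table reproduces exact multiplication, which gives a convenient sanity check.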