EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search
EH-DNAS addresses hardware-aware neural architecture search for developers and researchers who need efficient models on diverse platforms, including Edge GPUs, Edge TPUs, Mobile CPUs, and custom accelerators. It is a new method for a known bottleneck.
The paper tackles the challenge of computing gradients of hardware metrics in Differentiable Neural Architecture Search (DNAS). It proposes EH-DNAS, which integrates end-to-end hardware benchmarking with automated DNAS to deliver hardware-efficient neural networks. On CIFAR10 and ImageNet, EH-DNAS improves hardware performance by an average of 1.4x on customized accelerators and 1.6x on existing processors while maintaining classification accuracy.
In hardware-aware Differentiable Neural Architecture Search (DNAS), it is challenging to compute gradients of hardware metrics to perform architecture search. Existing works rely on linear approximations with limited support for customized hardware accelerators. In this work, we propose End-to-end Hardware-aware DNAS (EH-DNAS), a seamless integration of end-to-end hardware benchmarking and fully automated DNAS to deliver hardware-efficient deep neural networks on various platforms, including Edge GPUs, Edge TPUs, Mobile CPUs, and customized accelerators. Given a desired hardware platform, we propose to learn a differentiable model predicting the end-to-end hardware performance of neural network architectures for DNAS. We also introduce E2E-Perf, an end-to-end hardware benchmarking tool for customized accelerators. Experiments on CIFAR10 and ImageNet show that EH-DNAS improves hardware performance by an average of $1.4\times$ on customized accelerators and $1.6\times$ on existing hardware processors while maintaining classification accuracy.
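To make the idea of a differentiable performance model concrete, the following is a minimal sketch, not the authors' implementation. It assumes the architecture is encoded as per-layer softmax probabilities over candidate operations and that the performance model is a one-hidden-layer network. All names (`TinyPerfPredictor`, `predicted_latency`, `latency_grad_wrt_alpha`) and the network shape are hypothetical, and the predictor weights are random placeholders standing in for a model fitted to end-to-end benchmark measurements.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyPerfPredictor:
    """Hypothetical one-hidden-layer model: op probabilities -> predicted latency.

    Weights are random placeholders; in practice they would be fitted to
    measured end-to-end hardware performance (assumption).
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, p):
        """Return (prediction, hidden activations) for a flat input vector p."""
        h = [math.tanh(sum(w * x for w, x in zip(row, p)) + b)
             for row, b in zip(self.W1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return y, h

    def grad_input(self, p):
        """Gradient of the prediction with respect to each input p_i."""
        _, h = self.forward(p)
        # dy/dp_i = sum_j w2_j * (1 - h_j^2) * W1[j][i]
        return [sum(self.w2[j] * (1.0 - h[j] ** 2) * self.W1[j][i]
                    for j in range(len(h)))
                for i in range(len(p))]


def predicted_latency(alpha, predictor):
    """alpha: list of per-layer logit lists (the architecture parameters)."""
    flat = [q for layer in alpha for q in softmax(layer)]
    return predictor.forward(flat)[0]


def latency_grad_wrt_alpha(alpha, predictor):
    """Backpropagate the predicted latency through each layer's softmax."""
    probs = [softmax(layer) for layer in alpha]
    flat = [q for layer in probs for q in layer]
    g_p = predictor.grad_input(flat)
    grads, offset = [], 0
    for layer_p in probs:
        g = g_p[offset:offset + len(layer_p)]
        dot = sum(pi * gi for pi, gi in zip(layer_p, g))
        # Softmax backward pass: p_i * (g_i - sum_k p_k * g_k)
        grads.append([pi * (gi - dot) for pi, gi in zip(layer_p, g)])
        offset += len(layer_p)
    return grads


# Two searchable layers with three candidate operations each (illustrative).
alpha = [[0.1, -0.3, 0.7], [0.0, 0.5, -0.2]]
predictor = TinyPerfPredictor(n_in=6, n_hidden=4, seed=1)
grads = latency_grad_wrt_alpha(alpha, predictor)
```

Unlike a per-operator linear lookup, the gradient for one layer in this sketch depends on the choices made in the other layer, because the predictor sees the whole architecture encoding at once. During search, this gradient could be scaled by a trade-off weight and added to the task-loss gradient; that weighting is an assumption, since the abstract does not specify the loss.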