Runtime Configurable Deep Neural Networks for Energy-Accuracy Trade-off
This work addresses the need for flexible, efficient neural network deployment in resource-constrained environments such as embedded systems, although the contribution is incremental because it builds on existing dynamic reconfiguration methods.
The paper tackles runtime energy-accuracy trade-offs in deep neural networks by introducing a dynamic configuration technique that adjusts the number of active channels according to response-time, power, and accuracy targets. It reports up to 95% energy reduction with less than 1% accuracy loss and, compared to prior work on dynamic network reconfiguration, approximately 50% savings in storage at similar accuracy.
We present a novel dynamic configuration technique for deep neural networks that permits step-wise energy-accuracy trade-offs during runtime. Our configuration technique adjusts the number of channels in the network dynamically depending on response time, power, and accuracy targets. To enable this dynamic configuration technique, we co-design a new training algorithm, where the network is incrementally trained such that the weights in channels trained in earlier steps are fixed. Our technique provides the flexibility of multiple networks while storing and utilizing one set of weights. We evaluate our techniques using both an ASIC-based hardware accelerator and a low-power embedded GPGPU and show that our approach leads to only a small or negligible loss in the final network accuracy. We analyze the performance of our proposed methodology using three well-known networks for the MNIST, CIFAR-10, and SVHN datasets, and we show that we achieve up to 95% energy reduction with less than 1% accuracy loss across the three benchmarks. In addition, compared to prior work on dynamic network reconfiguration, we show that our approach leads to approximately 50% savings in storage requirements, while achieving similar accuracy.
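The incremental training idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a toy two-layer fully connected network in place of convolutional channels, a single training sample, plain SGD on squared error, and multiply-accumulate (MAC) count as a stand-in for energy. The names `forward`, `train_step`, `select_width`, and the step sizes in `widths` are hypothetical. The sketch shows two things: channels trained in earlier steps stay fixed while later channels are trained, and one weight set serves every configuration.

```python
import random

def forward(W1, W2, x, k):
    """Evaluate the sub-network that uses only the first k hidden channels."""
    h = [max(0.0, sum(w * xi for w, xi in zip(W1[j], x))) for j in range(k)]
    return [sum(row[j] * h[j] for j in range(k)) for row in W2]

def train_step(W1, W2, x, target, k_frozen, k_active, lr=0.01):
    """One SGD step on 0.5*||y - target||^2 that updates only channels
    k_frozen .. k_active-1; channels below k_frozen are left untouched."""
    pre = [sum(w * xi for w, xi in zip(W1[j], x)) for j in range(k_active)]
    h = [max(0.0, p) for p in pre]
    err = [sum(row[j] * h[j] for j in range(k_active)) - t
           for row, t in zip(W2, target)]
    for j in range(k_frozen, k_active):
        # Gradient w.r.t. h[j], computed before column j of W2 is modified.
        grad_h = sum(row[j] * e for row, e in zip(W2, err))
        for row, e in zip(W2, err):
            row[j] -= lr * e * h[j]
        if pre[j] > 0.0:  # ReLU passes gradient only where it is active
            for i, xi in enumerate(x):
                W1[j][i] -= lr * grad_h * xi

def select_width(mac_budget, widths, n_in, n_out):
    """Pick the widest configuration whose MAC count fits the budget
    (assumed energy proxy); fall back to the narrowest if none fits."""
    fitting = [k for k in widths if k * (n_in + n_out) <= mac_budget]
    return max(fitting) if fitting else min(widths)

random.seed(0)
n_in, n_hidden, n_out = 8, 6, 3
W1 = [[random.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
W2 = [[random.gauss(0.0, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
x = [random.gauss(0.0, 1.0) for _ in range(n_in)]
target = [random.gauss(0.0, 1.0) for _ in range(n_out)]

widths = [2, 4, 6]   # hypothetical incremental steps
snapshots = {}       # output of each configuration right after its own step
k_frozen = 0
for k in widths:
    for _ in range(50):
        train_step(W1, W2, x, target, k_frozen, k)
    snapshots[k] = forward(W1, W2, x, k)
    k_frozen = k

# Training the wider steps did not disturb the narrower configurations.
assert forward(W1, W2, x, 2) == snapshots[2]
assert forward(W1, W2, x, 4) == snapshots[4]

# At runtime, choose a width from the budget and reuse the same weights.
k_runtime = select_width(50, widths, n_in, n_out)
y = forward(W1, W2, x, k_runtime)
```

Because every configuration is a prefix of the same weight arrays, switching between them at runtime only changes the slice bound `k`. No second copy of the weights is stored, which is the property the abstract describes as the flexibility of multiple networks with one set of weights.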