LG · Nov 27, 2022

SteppingNet: A Stepping Neural Network with Incremental Accuracy Enhancement

Wenhao Sun, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Huaxi Gu, Bing Li, Ulf Schlichtmann

arXiv:2211.14926v13.33 citations · h-index: 32

Originality: Incremental advance

AI Analysis

The paper addresses dynamic accuracy-latency trade-offs for applications on mobile and autonomous systems, and it represents an incremental improvement in neural network design.

The paper tackles the challenge of deploying deep neural networks on resource-constrained platforms by proposing SteppingNet, a framework whose accuracy improves incrementally as more computational resources become available. According to the authors' experiments, its inference accuracy consistently outperforms state-of-the-art work under the same resource limits.

Deep neural networks (DNNs) have been applied successfully in many fields over the past decades. However, the increasing number of multiply-and-accumulate (MAC) operations in DNNs hinders their deployment on resource-constrained and resource-varying platforms, e.g., mobile phones and autonomous vehicles. On such platforms, neural networks need to provide acceptable results quickly, and it should be possible to enhance the accuracy of those results dynamically according to the computational resources available in the system. To address these challenges, we propose a design framework called SteppingNet. SteppingNet constructs a series of subnets whose accuracy is incrementally enhanced as more MAC operations become available. This design therefore allows a trade-off between accuracy and latency. In addition, the larger subnets in SteppingNet are built upon smaller subnets, so that the results of the latter can be reused directly in the former without recomputation. This property allows SteppingNet to decide on the fly whether to enhance the inference accuracy by executing further MAC operations. Experimental results demonstrate that SteppingNet provides an effective incremental accuracy improvement and that its inference accuracy consistently outperforms state-of-the-art work under the same limit of computational resources.
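The reuse property described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the class `SteppingMLP`, its methods, and the grouping of hidden units are hypothetical names and simplifications chosen for illustration. It assumes a single hidden layer whose units are split into nested groups, so that each larger subnet only adds the contributions of its new hidden units to the cached output of the smaller subnet. How the paper constructs and trains its subnets is not shown here.

```python
import random


def relu(v):
    return v if v > 0.0 else 0.0


class SteppingMLP:
    """Toy two-layer MLP (hypothetical) whose hidden units form nested groups."""

    def __init__(self, n_in, group_sizes, n_out, seed=0):
        rng = random.Random(seed)
        n_hidden = sum(group_sizes)
        self.group_sizes = list(group_sizes)
        self.n_out = n_out
        # Random weights stand in for trained ones; training is out of scope.
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]

    def start(self, x):
        """Create the cached state that later steps build on."""
        return {"x": list(x), "logits": [0.0] * self.n_out,
                "next_group": 0, "offset": 0, "macs": 0}

    def step(self, state):
        """Run only the MACs of the next hidden group and add them to the cache."""
        g = state["next_group"]
        if g >= len(self.group_sizes):
            raise ValueError("all subnets have already been executed")
        begin = state["offset"]
        end = begin + self.group_sizes[g]
        for j in range(begin, end):
            h = relu(sum(w * xi for w, xi in zip(self.w1[j], state["x"])))
            for k in range(self.n_out):
                state["logits"][k] += self.w2[k][j] * h
            state["macs"] += len(state["x"]) + self.n_out
        state["next_group"] = g + 1
        state["offset"] = end
        return state["logits"]

    def full_forward(self, x, n_groups):
        """Reference: recompute the subnet with the first n_groups groups from scratch."""
        n_hidden = sum(self.group_sizes[:n_groups])
        hidden = [relu(sum(w * xi for w, xi in zip(self.w1[j], x)))
                  for j in range(n_hidden)]
        return [sum(self.w2[k][j] * hidden[j] for j in range(n_hidden))
                for k in range(self.n_out)]


if __name__ == "__main__":
    net = SteppingMLP(n_in=4, group_sizes=[2, 3, 5], n_out=3)
    state = net.start([0.5, -1.0, 0.25, 2.0])
    mac_budget = 40  # hypothetical budget; stop stepping once it would be exceeded
    while state["next_group"] < len(net.group_sizes):
        next_cost = net.group_sizes[state["next_group"]] * (4 + 3)
        if state["macs"] + next_cost > mac_budget:
            break
        net.step(state)
    print(state["next_group"], state["macs"], state["logits"])
```

In this sketch, the caller decides before each `step` whether the remaining MAC budget justifies executing the next group, which mirrors the on-the-fly decision the abstract describes. The `full_forward` method exists only to check that the incremental result matches a from-scratch evaluation of the same subnet.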
