NE | Dec 16, 2016

Delta Networks for Optimized Recurrent Network Computation

arXiv:1612.05571v1 | 71 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency bottlenecks in RNNs for real-time applications like speech recognition and autonomous driving, offering incremental improvements through optimized training techniques.

The paper aims to reduce the computational and memory costs of recurrent neural networks (RNNs) by proposing delta networks, in which each neuron transmits its value only when the change in its activation exceeds a threshold. It reports a cost reduction of up to 9x on audio digit recognition with negligible accuracy loss, and a 100x reduction in RNN cost on steering angle prediction for driving.

Many neural networks exhibit stability in their activation patterns over time in response to inputs from sensors operating under real-world conditions. By capitalizing on this property of natural signals, we propose a Recurrent Neural Network (RNN) architecture called a delta network in which each neuron transmits its value only when the change in its activation exceeds a threshold. The execution of RNNs as delta networks is attractive because their states must be stored and fetched at every timestep, unlike in convolutional neural networks (CNNs). We show that a naive run-time delta network implementation offers modest improvements on the number of memory accesses and computes, but optimized training techniques confer higher accuracy at higher speedup. With these optimizations, we demonstrate a 9x reduction in cost with negligible loss of accuracy for the TIDIGITS audio digit recognition benchmark. Similarly, on the large Wall Street Journal speech recognition benchmark even existing networks can be greatly accelerated as delta networks, and a 5.7x improvement with negligible loss of accuracy can be obtained through training. Finally, on an end-to-end CNN trained for steering angle prediction in a driving dataset, the RNN cost can be reduced by a substantial 100x.
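The thresholded-transmission idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `delta_matvec_stream`, the threshold parameter `theta`, and the choice to track a "last transmitted" value per input are assumptions made for illustration. The sketch keeps a running matrix-vector product and only touches a weight column when the corresponding input has changed by more than the threshold, which is where the savings in memory accesses and multiply-accumulates would come from.

```python
def delta_matvec_stream(W, xs, theta):
    """Incrementally track y ~= W @ x over a sequence of input vectors xs.

    Hypothetical helper for illustration only. An input j is "transmitted"
    at a timestep only if it moved more than theta away from the value that
    was last transmitted; otherwise column j of W is not touched at all.
    Returns a list of (y, n_transmitted) pairs, one per timestep.
    """
    n_out, n_in = len(W), len(W[0])
    x_hat = [0.0] * n_in  # last transmitted value of each input (assumed to start at 0)
    y = [0.0] * n_out     # running product W @ x_hat
    results = []
    for x in xs:
        transmitted = 0
        for j in range(n_in):
            delta = x[j] - x_hat[j]
            if abs(delta) > theta:
                # Change is large enough: transmit it and update the product
                # using only column j of W.
                x_hat[j] = x[j]
                transmitted += 1
                for i in range(n_out):
                    y[i] += W[i][j] * delta
            # Otherwise the change is dropped for now; because x_hat[j] is
            # left unchanged, small drifts accumulate until they cross theta.
        results.append((list(y), transmitted))
    return results


W = [[1.0, 2.0], [3.0, 4.0]]
xs = [[1.0, 1.0], [1.05, 1.0], [1.05, 2.0]]
results = delta_matvec_stream(W, xs, theta=0.1)
for y, transmitted in results:
    print(y, transmitted)
```

In this toy run the second timestep transmits nothing, because the only change (0.05) is below the threshold, so no weights are fetched. With `theta=0` every nonzero change is transmitted and the running product matches the dense product up to floating-point rounding.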
