TableNet: a multiplier-less implementation of neural networks for inferencing
This work addresses hardware efficiency for deploying neural networks in resource-constrained environments, but its contribution is incremental because it builds on existing training methods.
The paper simplifies the hardware implementation of deep learning inference by replacing matrix multiply-and-add operations with look-up tables (LUTs) and additions, yielding a multiplier-less design whose performance and memory footprint are comparable to those of full-precision networks.
We consider the use of look-up tables (LUTs) to simplify the hardware implementation of a deep learning network for inferencing after the weights have been successfully trained. The LUTs replace the matrix multiply-and-add operations with a small number of table look-ups and additions, resulting in a completely multiplier-less implementation. We compare the tradeoffs of this approach in terms of accuracy versus LUT size and number of operations, and show that similar performance can be obtained with a memory footprint comparable to that of a full-precision deep neural network, but without the use of any multipliers. We illustrate this with several architectures, such as MLPs and CNNs.
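To make the idea of replacing multiply-and-add with look-ups and additions concrete, the following is a minimal sketch of one way a matrix-vector product can be computed without multipliers. It is not taken from the paper and may differ from the authors' construction. It assumes integer weights and unsigned fixed-point inputs of `n_bits` bits, and it relies only on the linearity of the product: the input is split into bit planes, the input positions are split into small groups, and each group has a table holding the weighted sum for every binary pattern. The names `build_luts`, `lut_matvec`, and `reference_matvec` are hypothetical and chosen for illustration.

```python
def build_luts(W, group_size):
    """Precompute one table per group of input positions (done offline).

    Entry `pattern` of a table holds the sum of the weight columns whose
    bit is set in `pattern`, i.e. W restricted to the group times a 0/1 vector.
    """
    n_rows, n_cols = len(W), len(W[0])
    luts = []
    for start in range(0, n_cols, group_size):
        width = min(group_size, n_cols - start)
        table = []
        for pattern in range(1 << width):
            entry = [0] * n_rows
            for j in range(width):
                if (pattern >> j) & 1:
                    for r in range(n_rows):
                        entry[r] += W[r][start + j]
            table.append(entry)
        luts.append((start, width, table))
    return luts


def lut_matvec(luts, x, n_bits, n_rows):
    """Compute W @ x using only table look-ups, bit shifts, and additions."""
    y = [0] * n_rows
    for k in range(n_bits):                      # one pass per input bit plane
        for start, width, table in luts:
            # Gather bit k of each input in the group into a table index.
            pattern = 0
            for j in range(width):
                pattern |= ((x[start + j] >> k) & 1) << j
            entry = table[pattern]
            for r in range(n_rows):
                y[r] += entry[r] << k            # shift stands in for scaling by 2**k
    return y


def reference_matvec(W, x):
    """Ordinary multiply-and-add product, used only to check the sketch."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


if __name__ == "__main__":
    W = [[1, -2, 3, 4], [0, 5, -1, 2]]
    x = [3, 0, 7, 5]                             # unsigned 3-bit inputs
    luts = build_luts(W, group_size=2)
    print(lut_matvec(luts, x, n_bits=3, n_rows=2))
    print(reference_matvec(W, x))
```

In this sketch, each table has 2^g entries for a group of g inputs, and each inference needs one look-up per group per bit plane. Larger groups therefore mean fewer look-ups and additions but exponentially larger tables, which is one simple instance of the LUT-size versus operation-count tradeoff mentioned above.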