A binary-activation, multi-level weight RNN and training algorithm for ADC-/DAC-free and noise-resilient processing-in-memory inference with eNVM
This work targets ADC-/DAC-free, noise-resilient PIM inference for hardware accelerators, with trigger-word detection as the main application. The contribution appears incremental, since it builds on existing binary-activation and multi-level-weight techniques.
The paper tackles the problem of enabling efficient processing-in-memory (PIM) inference with embedded nonvolatile memories (eNVM). It proposes a training algorithm for neural networks with binary activations and multi-level weights that achieves higher accuracy and noise resilience on recurrent networks than existing methods.
We propose a new algorithm for training neural networks with binary activations and multi-level weights, which enables efficient processing-in-memory circuits with embedded nonvolatile memories (eNVM). Binary activations obviate costly DACs and ADCs. Multi-level weights leverage multi-level eNVM cells. Compared to existing algorithms, our method not only works for feed-forward networks (e.g., fully-connected and convolutional), but also achieves higher accuracy and noise resilience for recurrent networks. In particular, we present an RNN-based trigger-word detection PIM accelerator, with detailed hardware noise models and circuit co-design techniques, and validate our algorithm's high inference accuracy and robustness against a variety of real hardware non-idealities.
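To make the two ingredients concrete, the following is a minimal NumPy sketch of a single recurrent inference step with binary activations and multi-level weights. It is not the paper's training algorithm or hardware noise model: the uniform quantizer, the additive Gaussian noise on the pre-activation, and the names `binarize`, `quantize_weights`, and `noisy_rnn_step` are all assumptions made for illustration.

```python
import numpy as np

def binarize(x):
    """Binary activation: map pre-activations to {-1, +1} (assumed sign encoding)."""
    return np.where(x >= 0, 1.0, -1.0)

def quantize_weights(w, num_levels=4, w_max=1.0):
    """Uniformly quantize weights to num_levels values in [-w_max, w_max].

    Stands in for the discrete conductance levels of a multi-level eNVM cell;
    the uniform spacing is an assumption, not taken from the paper.
    """
    w = np.clip(w, -w_max, w_max)
    step = 2.0 * w_max / (num_levels - 1)
    return np.round((w + w_max) / step) * step - w_max

def noisy_rnn_step(h_prev, x, w_h, w_x, noise_std=0.0, rng=None):
    """One recurrent step: quantized weights, noisy accumulation, binary output.

    h_prev and x are assumed to be binary already, so no DAC is needed on the
    inputs, and the sign threshold plays the role of a 1-bit readout instead
    of an ADC. Gaussian noise is a hypothetical placeholder for hardware noise.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    pre = quantize_weights(w_h) @ h_prev + quantize_weights(w_x) @ x
    pre = pre + noise_std * rng.standard_normal(pre.shape)
    return binarize(pre)

# Usage: 8 hidden units, 5 binary inputs, mild noise.
rng = np.random.default_rng(42)
w_h = rng.uniform(-1, 1, size=(8, 8))
w_x = rng.uniform(-1, 1, size=(8, 5))
h = binarize(rng.standard_normal(8))
x = binarize(rng.standard_normal(5))
h_next = noisy_rnn_step(h, x, w_h, w_x, noise_std=0.1, rng=rng)
```

Because both the inputs and the outputs of each step are binary, the only analog quantity in this sketch is the accumulated pre-activation, which is where the noise term is injected.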