ARLGMay 12

Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

arXiv:2605.1521666.11 citations
AI Analysis

This work enables ultra-low-power recurrent neural network inference for always-on AI applications, overcoming the previously assumed impracticality of analog recurrence.

The authors demonstrate that analog recurrent neural networks can be made practical through hardware-software co-design, achieving sub-microwatt inference for keyword spotting with a 20-fold noise suppression at each cell boundary.

Always-on AI applications, from environmental sensors to biomedical implants, require ultra-low power consumption. Analog circuits offer a path to sub-microwatt inference, yet existing analog implementations are limited to feedforward architectures: extending them to recurrent dynamics has been considered impractical due to noise accumulation through temporal feedback. We demonstrate that this barrier can be overcome through hardware-software co-design. Specifically, we identify that Bistable Memory Recurrent Units (BMRUs), a class of Recurrent Neural Networks (RNNs) with discrete-valued outputs and hysteretic dynamics, admit an ultra-low power current-mode analog implementation which we design from first principles. The resulting circuit establishes a one-to-one correspondence between each learned parameter and a circuit element. The discrete outputs suppress analog noise by at least 20-fold at each cell boundary, breaking the noise accumulation that prevents analog recurrence. We reformulate BMRUs for first-quadrant operation with fixed thresholds, enabling the direct correspondence while preserving expressivity and trainability. Transistor-level simulations in 180 nm Complementary Metal-Oxide-Semiconductor (CMOS) show near-perfect agreement between software predictions and circuit-level behavior, with the software model thereby serving as a high-fidelity simulator of the physical hardware at low computational cost. We leverage this fidelity to conduct large-scale noise immunity and power scaling analyses: the power cost of adding recurrence scales linearly with state dimension, while the feedforward layers dominating total power scale quadratically, meaning recurrence is added at linear marginal cost relative to the feedforward backbone. End-to-end keyword spotting achieves sub-microwatt inference at the RNN core.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes