LG · AI · Mar 31, 2025

Noise-based reward-modulated learning

Jesús García Fernández, Nasir Ahmad, Marcel van Gerven

arXiv:2503.23972v3 · 1 citation · h-index: 40

Originality: Highly original

AI Analysis

This work addresses energy-efficient, adaptive AI for neuromorphic computing. The learning paradigm it proposes is novel, while its application to specific hardware constraints is incremental.

The authors tackled the challenge of enabling effective learning on neuromorphic platforms by proposing noise-based reward-modulated learning (NRL), a synaptic plasticity rule that unifies reinforcement learning and gradient-based optimization with local updates. NRL achieved performance comparable to backpropagation baselines and significantly outperformed reward-modulated Hebbian learning in multi-layer networks, demonstrating its potential for low-power adaptive systems.

The pursuit of energy-efficient and adaptive artificial intelligence (AI) has positioned neuromorphic computing as a promising alternative to conventional computing. However, achieving learning on these platforms requires techniques that prioritize local information while enabling effective credit assignment. Here, we propose noise-based reward-modulated learning (NRL), a novel synaptic plasticity rule that mathematically unifies reinforcement learning and gradient-based optimization with biologically-inspired local updates. NRL addresses the computational bottleneck of exact gradients by approximating them through stochastic neural activity, transforming the inherent noise of biological and neuromorphic substrates into a functional resource. Drawing inspiration from biological learning, our method uses reward prediction errors as its optimization target to generate increasingly advantageous behavior, and eligibility traces to facilitate retrospective credit assignment. Experimental validation on reinforcement tasks, featuring immediate and delayed rewards, shows that NRL achieves performance comparable to baselines optimized using backpropagation, although with slower convergence, while showing significantly superior performance and scalability in multi-layer networks compared to reward-modulated Hebbian learning (RMHL), the most prominent similar approach. While tested on simple architectures, the results highlight the potential of noise-driven, brain-inspired learning for low-power adaptive systems, particularly in computing substrates with locality constraints. NRL offers a theoretically grounded paradigm well-suited for the event-driven characteristics of next-generation neuromorphic AI.
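The core idea in the abstract, using injected noise together with a reward prediction error to stand in for an exact gradient, can be illustrated with a toy example. The following is a minimal sketch under stated assumptions and is not the authors' NRL rule: the single linear unit, the quadratic reward, the running-average reward prediction, the simple decaying eligibility trace, and every name and parameter value (`train_noise_driven`, `target_w`, `lr`, `sigma`, `trace_decay`) are hypothetical choices made for illustration.

```python
import random


def train_noise_driven(steps=20000, lr=0.01, sigma=0.2, trace_decay=0.0, seed=0):
    """Train one noisy linear unit with a local, reward-modulated update (toy sketch)."""
    rng = random.Random(seed)
    target_w = [0.5, -0.3]   # hypothetical mapping that defines the reward
    w = [0.0, 0.0]           # learned synaptic weights
    reward_pred = 0.0        # running prediction of the reward (assumed baseline)
    trace = [0.0, 0.0]       # eligibility trace, one entry per synapse

    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0) for _ in w]
        noise = rng.gauss(0.0, sigma)  # stochastic activity of the unit

        # Noisy forward pass and a scalar reward (higher is better).
        y = sum(wi * xi for wi, xi in zip(w, x)) + noise
        y_target = sum(gi * xi for gi, xi in zip(target_w, x))
        reward = -(y - y_target) ** 2

        # Reward prediction error: obtained reward minus predicted reward.
        rpe = reward - reward_pred
        reward_pred += 0.1 * rpe

        # Each synapse accumulates (noise * presynaptic input); with
        # trace_decay = 0 the trace holds only the current step.
        trace = [trace_decay * ei + noise * xi for ei, xi in zip(trace, x)]

        # Local update: the RPE gates the trace. For this quadratic reward,
        # the expected update follows the reward gradient.
        w = [wi + lr * rpe * ei / sigma ** 2 for wi, ei in zip(w, trace)]

    return w, target_w


if __name__ == "__main__":
    learned, target = train_noise_driven()
    print("learned:", learned)
    print("target: ", target)
```

Each synapse here uses only its own input, the unit's noise sample, and a globally broadcast scalar error, which is the kind of locality the abstract emphasizes. Because the noise is zero-mean and independent of the reward prediction, correlating it with the prediction error yields a noisy estimate of the reward gradient. This is consistent with the slower convergence relative to backpropagation that the abstract reports. A nonzero `trace_decay` would let past activity remain eligible when rewards arrive late, at the cost of extra variance.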
