AI | RO | Jul 13, 2025

Learning to Control Dynamical Agents via Spiking Neural Networks and Metropolis-Hastings Sampling

arXiv:2507.09540v1 | h-index: 8 | Mind
Originality: Incremental advance
AI Analysis

The paper addresses energy-efficient real-time control for robotics and embedded systems. It appears incremental, because it adapts an existing sampling technique to a specific domain.

The paper tackles the challenge of training Spiking Neural Networks (SNNs) for reinforcement learning tasks by introducing a Metropolis-Hastings sampling framework, which outperforms Deep Q-Learning and prior SNN methods on AcroBot and CartPole benchmarks in terms of reward and efficiency.

Spiking Neural Networks (SNNs) offer biologically inspired, energy-efficient alternatives to traditional Deep Neural Networks (DNNs) for real-time control systems. However, their training presents several challenges, particularly for reinforcement learning (RL) tasks, due to the non-differentiable nature of spike-based communication. In this work, we introduce what is, to our knowledge, the first framework that employs Metropolis-Hastings (MH) sampling, a Bayesian inference technique, to train SNNs for dynamical agent control in RL environments without relying on gradient-based methods. Our approach iteratively proposes and probabilistically accepts network parameter updates based on accumulated reward signals, effectively circumventing the limitations of backpropagation while enabling direct optimization on neuromorphic platforms. We evaluated this framework on two standard control benchmarks: AcroBot and CartPole. The results demonstrate that our MH-based approach outperforms conventional Deep Q-Learning (DQL) baselines and prior SNN-based RL approaches in terms of maximizing the accumulated reward while minimizing network resources and training episodes.
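The propose-and-accept loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a symmetric Gaussian proposal and an acceptance rule based on exp((R_new - R_old) / T), which corresponds to sampling parameters with probability proportional to exp(R / T). The names `metropolis_hastings_search`, `toy_reward`, `step`, and `temperature` are hypothetical. The toy reward function stands in for the accumulated reward that would come from running an SNN policy for one episode in an environment such as CartPole.

```python
import math
import random


def metropolis_hastings_search(reward_fn, theta0, n_iters=200, step=0.1,
                               temperature=1.0, seed=0):
    """Gradient-free parameter search with a Metropolis-Hastings accept rule.

    reward_fn   : maps a parameter list to a scalar reward (assumed to be the
                  accumulated episode reward of a policy using those parameters)
    theta0      : initial parameter list
    step        : standard deviation of the symmetric Gaussian proposal (assumption)
    temperature : scales how readily lower-reward proposals are accepted (assumption)
    """
    rng = random.Random(seed)
    theta = list(theta0)
    reward = reward_fn(theta)
    best_theta, best_reward = list(theta), reward

    for _ in range(n_iters):
        # Propose a perturbed copy of the current parameters.
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        new_reward = reward_fn(proposal)

        # Symmetric proposal, so the acceptance probability reduces to
        # min(1, exp((new_reward - reward) / temperature)).
        if new_reward >= reward:
            accept_prob = 1.0
        else:
            accept_prob = math.exp((new_reward - reward) / temperature)

        if rng.random() < accept_prob:
            theta, reward = proposal, new_reward
            if reward > best_reward:
                best_theta, best_reward = list(theta), reward

    return best_theta, best_reward


def toy_reward(theta):
    # Hypothetical stand-in for an episode return: peaks at theta = [1, 1, ...].
    return -sum((t - 1.0) ** 2 for t in theta)


if __name__ == "__main__":
    best_theta, best_reward = metropolis_hastings_search(toy_reward, [0.0, 0.0, 0.0])
    print(best_theta, best_reward)
```

Because a worse proposal is still accepted with nonzero probability, the chain can leave a local optimum, and no gradient of the spiking dynamics is needed. In a real setting the reward would be noisy across episodes, so an implementation would likely average over several rollouts per proposal.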

