LG | QUANT-PH | Nov 6, 2025

Quantum Boltzmann Machines for Sample-Efficient Reinforcement Learning

arXiv:2511.04856v1 | h-index: 3
Originality: Highly original
AI Analysis

This work targets sample efficiency and training instability in continuous-control reinforcement learning. It offers a novel method for a known bottleneck rather than a foundational breakthrough.

The paper tackles sample-efficient reinforcement learning in continuous-action settings by introducing Continuous Semi-Quantum Boltzmann Machines (CSQBMs). These models combine quantum and classical elements to reduce qubit requirements and allow gradients with respect to continuous variables to be computed analytically. The result is a continuous Q-learning framework that replaces global maximization with sampling from the CSQBM distribution, which the authors present as a remedy for instability in continuous control.

We introduce theoretically grounded Continuous Semi-Quantum Boltzmann Machines (CSQBMs) that support continuous-action reinforcement learning. By combining exponential-family priors over visible units with quantum Boltzmann distributions over hidden units, CSQBMs yield a hybrid quantum-classical model that reduces qubit requirements while retaining strong expressiveness. Crucially, gradients with respect to continuous variables can be computed analytically, enabling direct integration into Actor-Critic algorithms. Building on this, we propose a continuous Q-learning framework that replaces global maximization by efficient sampling from the CSQBM distribution, thereby overcoming instability issues in continuous control.
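The idea of replacing the global maximization in Q-learning with sampling can be illustrated with a purely classical toy. The sketch below is not the paper's method and contains no quantum component. It assumes a hypothetical one-dimensional critic `q_value`, draws actions with probability proportional to exp(Q(s, a) / T) by resampling uniform candidates, and averages Q over the sampled next actions to form a TD target. The function names, the temperature, the action bounds, and the choice to average over samples are all assumptions made for illustration.

```python
import math
import random


def q_value(state, action):
    """Hypothetical toy critic (an assumption): peaks at action = 0.5 * state."""
    return -(action - 0.5 * state) ** 2


def sample_boltzmann_action(state, rng, temperature=0.1,
                            n_candidates=256, low=-1.0, high=1.0):
    """Draw one action with probability roughly proportional to exp(Q / T).

    Classical stand-in: uniform candidates are resampled by their Boltzmann
    weights. A CSQBM would supply these samples in the paper's setting.
    """
    candidates = [rng.uniform(low, high) for _ in range(n_candidates)]
    scores = [q_value(state, a) / temperature for a in candidates]
    top = max(scores)  # subtract the maximum so exp() cannot overflow
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def sampled_td_target(reward, next_state, rng, gamma=0.99, n_samples=32):
    """TD target that averages Q over sampled next actions (no argmax)."""
    actions = [sample_boltzmann_action(next_state, rng)
               for _ in range(n_samples)]
    mean_q = sum(q_value(next_state, a) for a in actions) / n_samples
    return reward + gamma * mean_q


rng = random.Random(0)  # fixed seed so the run is repeatable
action = sample_boltzmann_action(0.8, rng)
target = sampled_td_target(1.0, 0.8, rng)
print(action, target)
```

Lowering `temperature` concentrates the samples near the action that maximizes the toy critic, so the target approaches the usual max-based one. Higher values spread the samples out and soften the target.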
