Quantum Reinforcement Learning by Adaptive Non-local Observables
This work addresses a measurement bottleneck in quantum reinforcement learning and offers researchers in quantum machine learning an incremental but practical improvement.
The paper tackles the reliance of variational quantum circuits (VQCs) on local measurements in quantum reinforcement learning by introducing an adaptive non-local observable (ANO) paradigm that jointly optimizes circuit parameters and multi-qubit measurements. The resulting ANO-VQC architecture outperforms baseline VQCs on multiple benchmark tasks, and ablation studies show that adaptive measurements enlarge the function space without increasing circuit depth.
Hybrid quantum-classical frameworks leverage quantum computing for machine learning; however, variational quantum circuits (VQCs) are limited by the need for local measurements. We introduce an adaptive non-local observable (ANO) paradigm within VQCs for quantum reinforcement learning (QRL), jointly optimizing circuit parameters and multi-qubit measurements. The ANO-VQC architecture serves as the function approximator in Deep Q-Network (DQN) and Asynchronous Advantage Actor-Critic (A3C) algorithms. On multiple benchmark tasks, ANO-VQC agents outperform baseline VQCs. Ablation studies reveal that adaptive measurements enhance the function space without increasing circuit depth. Our results demonstrate that adaptive multi-qubit observables can enable practical quantum advantages in reinforcement learning.
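To make the idea of a trainable multi-qubit measurement concrete, the following is a minimal sketch, not the paper's implementation. It assumes the observable is a Hermitian matrix assembled from real parameters (real diagonal entries plus complex upper-triangular entries) and that the agent's output is the expectation value of that observable in the circuit's output state. The helper names `build_hermitian` and `expectation`, the parameter layout, and the Bell-state example are all illustrative assumptions.

```python
import math


def build_hermitian(diag, off_diag):
    """Assemble a d x d Hermitian matrix from real parameters (hypothetical layout).

    diag: list of d real numbers for the diagonal.
    off_diag: dict mapping (i, j) with i < j to a (real, imag) pair.
    """
    d = len(diag)
    H = [[0j] * d for _ in range(d)]
    for i, value in enumerate(diag):
        H[i][i] = complex(value, 0.0)
    for (i, j), (re, im) in off_diag.items():
        H[i][j] = complex(re, im)
        H[j][i] = complex(re, -im)  # conjugate entry keeps H Hermitian
    return H


def expectation(state, H):
    """Return <psi|H|psi> for a normalized state vector; real when H is Hermitian."""
    d = len(state)
    total = 0j
    for i in range(d):
        for j in range(d):
            total += state[i].conjugate() * H[i][j] * state[j]
    return total.real


# Two-qubit Bell state (|00> + |11>) / sqrt(2), basis order |00>, |01>, |10>, |11>.
amp = 1.0 / math.sqrt(2.0)
bell = [complex(amp), 0j, 0j, complex(amp)]

# Fixed local observable: Pauli-Z on the first qubit (Z tensor I).
z_on_first = build_hermitian([1.0, 1.0, -1.0, -1.0], {})

# A two-qubit observable with one off-diagonal entry coupling |00> and |11>.
# In an adaptive scheme these entries would be trained with the circuit parameters.
two_qubit_obs = build_hermitian([0.0, 0.0, 0.0, 0.0], {(0, 3): (1.0, 0.0)})

local_value = expectation(bell, z_on_first)        # approximately 0
nonlocal_value = expectation(bell, two_qubit_obs)  # approximately 1
```

For the same state, the local Pauli-Z measurement returns roughly 0 while the two-qubit observable returns roughly 1. This illustrates how changing the measurement alone can alter the set of reachable outputs without adding gates. A Hermitian matrix acting on k qubits has 4^k real degrees of freedom, so the number of measurement parameters grows quickly with the number of qubits measured jointly.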