QUANT-PHLGDec 13, 2022

Quantum Policy Gradient Algorithm with Optimized Action Decoding

arXiv:2212.06663v231 citationsh-index: 36
Originality Incremental advance
AI Analysis

This work addresses a specific bottleneck in quantum reinforcement learning for researchers in quantum machine learning, with potential broader applications in VQC-based algorithms.

The authors tackled the problem of action decoding in quantum reinforcement learning by proposing an optimized procedure for quantum policy gradient algorithms, resulting in significant performance improvements in benchmark environments and successful execution on a 5-qubit hardware device.

Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose a specific action decoding procedure for a quantum policy gradient approach. We introduce a novel quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes