QUANT-PH LG Mar 20, 2022

Policy Gradients using Variational Quantum Circuits

André Sequeira, Luis Paulo Santos, Luís Soares Barbosa

arXiv:2203.10591v3 12.229 citations h-index: 7

Originality: Incremental advance

AI Analysis

This work aims to improve Reinforcement Learning efficiency, a problem relevant to researchers in quantum machine learning, though it appears incremental because it builds on existing quantum circuit methods.

The paper applies Variational Quantum Circuits to Reinforcement Learning by using them as parameterized policies, and shows that an ε-approximation of the policy gradient can be obtained with a sample complexity that is logarithmic in the number of parameters. Empirically, these quantum models performed similarly to or better than classical neural networks in standard benchmarks and in quantum control, while using fewer parameters.
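To give a feel for where a logarithmic dependence on the number of parameters can come from, the following is one standard concentration argument. It is a generic sketch and not necessarily the proof used in the paper. The symbols $K$ (number of parameters), $N$ (samples per gradient component), $B$ (a bound on each sampled term) and $\delta$ (failure probability) are assumptions introduced here for illustration.

```latex
% Assumption: each component estimate \hat{g}_k is the average of N i.i.d.
% samples, each lying in [-B, B], with mean equal to the true component g_k.
\begin{align}
  \Pr\bigl[\,|\hat{g}_k - g_k| \ge \epsilon\,\bigr]
    &\le 2\exp\!\left(-\frac{N\epsilon^2}{2B^2}\right)
    && \text{(Hoeffding, for each } k\text{)} \\
  \Pr\bigl[\,\max_{1 \le k \le K} |\hat{g}_k - g_k| \ge \epsilon\,\bigr]
    &\le 2K\exp\!\left(-\frac{N\epsilon^2}{2B^2}\right)
    && \text{(union bound over } K \text{ components)} \\
  N &\ge \frac{2B^2}{\epsilon^2}\,\ln\frac{2K}{\delta}
    && \Longrightarrow \quad \text{failure probability} \le \delta
\end{align}
```

Under these assumptions the number of samples per component grows as $\ln K$, so an $\epsilon$-approximation in the $\ell_\infty$ norm needs only logarithmically many samples in the parameter count.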

Variational Quantum Circuits are being used as versatile Quantum Machine Learning models. Some empirical results exhibit an advantage in supervised and generative learning tasks. However, when applied to Reinforcement Learning, less is known. In this work, we consider a Variational Quantum Circuit composed of a low-depth hardware-efficient ansatz as the parameterized policy of a Reinforcement Learning agent. We show that an $\epsilon$-approximation of the policy gradient can be obtained using a number of samples that is logarithmic in the total number of parameters. We empirically verify that such quantum models behave similarly to, or even outperform, typical classical neural networks used in standard benchmarking environments and in quantum control, using only a fraction of the parameters. Moreover, we study the Barren Plateau phenomenon in quantum policy gradients using the Fisher Information Matrix spectrum.
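As a rough illustration of what using a quantum circuit as a parameterized policy can look like, here is a minimal NumPy sketch. It is a toy under stated assumptions, not the authors' ansatz or code: a single simulated qubit, the state encoded as an RX angle, two trainable RY angles, and the action read off as the measurement outcome (a Born-rule policy). The function names `policy`, `grad_log_policy` and `reinforce_gradient` are hypothetical. The gradient uses the parameter-shift rule, which is exact for rotation gates of the form exp(-iθP/2).

```python
import numpy as np


def ry(angle):
    """Single-qubit rotation exp(-i * angle / 2 * Y)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def rx(angle):
    """Single-qubit rotation exp(-i * angle / 2 * X)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)


def policy(theta, state):
    """Toy Born-rule policy (assumption): pi(a|s) is the probability of
    measuring outcome a after RY(theta[1]) RX(state) RY(theta[0]) |0>."""
    zero = np.array([1.0, 0.0], dtype=complex)
    psi = ry(theta[1]) @ rx(state) @ ry(theta[0]) @ zero
    return np.abs(psi) ** 2


def grad_log_policy(theta, state, action):
    """Gradient of log pi(action|state) via the parameter-shift rule."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros(len(theta))
    for k in range(len(theta)):
        shift = np.zeros(len(theta))
        shift[k] = np.pi / 2
        plus = policy(theta + shift, state)[action]
        minus = policy(theta - shift, state)[action]
        grad[k] = (plus - minus) / 2  # d pi / d theta_k
    return grad / policy(theta, state)[action]


def reinforce_gradient(theta, episodes):
    """REINFORCE estimate: average over episodes of
    sum_t return_t * grad log pi(a_t | s_t).
    `episodes` is a list of lists of (state, action, return) tuples."""
    total = np.zeros(len(theta))
    for episode in episodes:
        for state, action, ret in episode:
            total += ret * grad_log_policy(theta, state, action)
    return total / len(episodes)
```

The sketch evaluates probabilities exactly from the simulated state vector. On hardware, each probability would instead be estimated from a finite number of measurement shots, which is where the sample-complexity question arises.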

