Variational Quantum Reinforcement Learning via Evolutionary Optimization
This work addresses the problem of scaling quantum reinforcement learning and is aimed at researchers in quantum computing and AI. The contribution is incremental, since it builds on existing quantum and classical methods.
The paper tackles the challenge of performing reinforcement learning on quantum computers that have only a limited number of qubits. It introduces two frameworks: amplitude encoding for the Cart-Pole problem, and a hybrid tensor network-variational quantum circuit (TN-VQC) architecture for high-dimensional inputs such as the 147-dimensional inputs of the MiniGrid environment. The authors report a quantum advantage in parameter saving and argue that the approach enables applications on noisy quantum devices.
Recent advances in classical reinforcement learning (RL) and quantum computation (QC) point to a promising direction: performing RL on a quantum computer. However, potential applications of quantum RL are limited by the number of qubits available on modern quantum devices. Here we present two frameworks for deep quantum RL tasks using gradient-free evolutionary optimization. First, we apply the amplitude encoding scheme to the Cart-Pole problem. Second, we propose a hybrid framework in which the quantum RL agents are equipped with a hybrid tensor network-variational quantum circuit (TN-VQC) architecture to handle inputs whose dimension exceeds the number of qubits. This allows us to perform quantum RL on the MiniGrid environment with 147-dimensional inputs. We demonstrate the quantum advantage of parameter saving using amplitude encoding. The hybrid TN-VQC architecture provides a natural way to compress the input dimension efficiently, enabling further quantum RL applications on noisy intermediate-scale quantum devices.
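
To make the qubit-saving argument concrete, the following is a minimal classical sketch of amplitude encoding, not the authors' implementation. It assumes the input is zero-padded to the next power of two and L2-normalized, so that a vector with up to 2^n entries becomes the amplitudes of an n-qubit state. The function name `amplitude_encode` and the sample observation values are hypothetical, chosen only for illustration.

```python
import numpy as np


def amplitude_encode(features):
    """Return (amplitudes, n_qubits) for a real feature vector.

    Assumption for this sketch: the vector is zero-padded to the next
    power of two and L2-normalized, so 2**n features fit into n qubits.
    """
    x = np.asarray(features, dtype=float).ravel()
    if x.size == 0:
        raise ValueError("feature vector is empty")
    # Smallest n with 2**n >= x.size (at least one qubit).
    n_qubits = max(1, int(x.size - 1).bit_length())
    padded = np.zeros(2 ** n_qubits)
    padded[: x.size] = x
    norm = np.linalg.norm(padded)
    if norm == 0.0:
        raise ValueError("cannot encode the all-zero vector")
    return padded / norm, n_qubits


# Hypothetical Cart-Pole observation: position, velocity, angle, angular velocity.
cart_pole_obs = np.array([0.02, -0.31, 0.04, 0.45])
state, n_qubits = amplitude_encode(cart_pole_obs)

# The squared amplitudes form a valid probability distribution.
total_probability = float(np.sum(state ** 2))
```

Under these assumptions the four Cart-Pole observation components fit into two qubits, whereas an encoding that assigns one qubit per feature would need four. The same padding rule would place a 147-dimensional input on eight qubits, but preparing such a state on hardware is a separate cost that this classical sketch does not model.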