Fidelity-based Probabilistic Q-learning for Control of Quantum Systems
This is an incremental improvement for quantum control applications, addressing a specific bottleneck in reinforcement learning: the exploration-exploitation trade-off in Q-learning.
The paper tackles the exploration-exploitation balance problem in Q-learning by proposing a fidelity-based probabilistic Q-learning (FPQL) approach for quantum system control, demonstrating in spin-1/2 and atomic systems that it achieves better balance, avoids local optima, and accelerates learning.
The balance between exploration and exploitation is a key problem for reinforcement learning methods, especially for Q-learning. In this paper, a fidelity-based probabilistic Q-learning (FPQL) approach is presented to address this problem naturally and is applied to learning control of quantum systems. In this approach, fidelity is used to help direct the learning process, and the probability of selecting each action in a given state is updated iteratively as learning proceeds. This leads to a natural exploration strategy rather than a deliberately designed one that depends on configured parameters. A probabilistic Q-learning (PQL) algorithm is first presented to demonstrate the basic idea of probabilistic action selection. The FPQL algorithm is then presented for learning control of quantum systems. Two examples (a spin-1/2 system and a Λ-type atomic system) are used to test the performance of the FPQL algorithm. The results show that the FPQL algorithm attains a better balance between exploration and exploitation, and can also avoid locally optimal policies and accelerate the learning process.
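The idea of probabilistic action selection guided by fidelity can be illustrated with a minimal Python sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration: the function names, the additive-then-renormalize probability update, and the use of fidelity as a multiplier on the reinforcement gain are hypothetical and are not taken from the paper. Only the one-step Q-learning update and the pure-state fidelity |⟨target|ψ⟩|² are standard.

```python
import numpy as np


def fidelity(psi, target):
    """Pure-state fidelity |<target|psi>|^2 for normalized state vectors."""
    return float(abs(np.vdot(target, psi)) ** 2)


def select_action(probs, state, rng):
    """Sample an action from the state's probability row.

    Exploration comes from the distribution itself, so no epsilon or
    temperature parameter is needed.
    """
    return int(rng.choice(len(probs[state]), p=probs[state]))


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update (in place)."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])


def prob_update(probs, s, a, gain):
    """ASSUMED rule: add a non-negative gain to the chosen action's
    probability, then renormalize the row so it sums to 1."""
    probs[s, a] += max(gain, 0.0)
    probs[s] /= probs[s].sum()


def fidelity_gain(r, Q, s_next, psi, target, k=0.1):
    """ASSUMED fidelity weighting: scale the reinforcement gain by how
    close the current quantum state is to the target state."""
    return k * fidelity(psi, target) * (r + Q[s_next].max())


# Tiny illustration: two abstract states, two actions, uniform start.
rng = np.random.default_rng(0)
Q = np.zeros((2, 2))
probs = np.full((2, 2), 0.5)
target = np.array([0.0, 1.0], dtype=complex)               # spin-down target
psi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)   # current state

a = select_action(probs, 0, rng)
q_update(Q, 0, a, r=1.0, s_next=1)
prob_update(probs, 0, a, fidelity_gain(1.0, Q, 1, psi, target))
```

In this sketch, actions that lead toward high-fidelity states gain probability mass faster, while every action keeps a nonzero chance of being sampled, which is one way to obtain exploration without a separately tuned schedule.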