Relative Entropy Regularized Reinforcement Learning for Efficient Encrypted Policy Synthesis
This work addresses privacy concerns in reinforcement learning for applications where data confidentiality is critical, though it appears to be an incremental improvement combining existing techniques.
The authors tackle privacy-preserving reinforcement learning by developing an encrypted policy synthesis method based on relative-entropy-regularized RL, whose structure allows efficient integration with fully homomorphic encryption. Numerical simulations validate the convergence and error analysis and illustrate that policies can be synthesized on encrypted data.
We propose an efficient encrypted policy synthesis method for privacy-preserving model-based reinforcement learning. We first show that the relative-entropy-regularized reinforcement learning (RERL) framework offers a computationally convenient linear and ``min-free'' structure for value iteration, enabling a direct and efficient integration of fully homomorphic encryption (FHE) with bootstrapping into policy synthesis. We then analyze convergence and error bounds for the encrypted policy synthesis, accounting for how encryption-induced errors, including quantization and bootstrapping errors, propagate through the iterations. Numerical simulations validate the theoretical analysis and demonstrate the effectiveness of the RERL framework in integrating FHE for encrypted policy synthesis.
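To make the linear and ``min-free'' structure concrete, the following minimal sketch uses one well-known relative-entropy-regularized formulation, the linearly solvable MDP, as an assumed stand-in; the abstract does not state the paper's exact recursion, costs, or encryption scheme. Here the exponentiated value z = exp(-V) is updated using only additions and multiplications, which are the operations FHE schemes support natively. The transition matrix P, state costs q, the function names, and the fixed-point rounding used to mimic quantization error are all hypothetical illustrations, and no actual encryption is performed.

```python
import math

def desirability_iteration(P, q, terminal, iters=200, quantize=None):
    """Iterate z <- exp(-q) * (P z) on non-terminal states.

    The update has no min/max: only sums and products appear.
    `quantize` optionally rounds each update (a plaintext stand-in for
    encryption-induced error; this is an assumption, not real FHE).
    """
    n = len(q)
    z = [1.0] * n
    for s in terminal:
        z[s] = math.exp(-q[s])  # boundary condition at terminal states
    for _ in range(iters):
        z_next = list(z)
        for s in range(n):
            if s in terminal:
                continue
            value = math.exp(-q[s]) * sum(P[s][t] * z[t] for t in range(n))
            z_next[s] = quantize(value) if quantize else value
        z = z_next
    return z

def optimal_policy(P, z):
    """Policy over next states: u(t|s) proportional to P[s][t] * z[t]."""
    policy = []
    for row in P:
        weights = [p * zt for p, zt in zip(row, z)]
        total = sum(weights)
        policy.append([w / total for w in weights])
    return policy

# Hypothetical 4-state chain; state 3 is an absorbing goal.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
q = [0.1, 0.1, 0.1, 0.0]  # per-step state costs (assumed values)
terminal = {3}

z = desirability_iteration(P, q, terminal)
policy = optimal_policy(P, z)

# Fixed-point rounding to 10 fractional bits: per-update error <= 2**-11.
delta = 2.0 ** -11
z_q = desirability_iteration(
    P, q, terminal, quantize=lambda x: round(x * 2 ** 10) / 2 ** 10
)

# Each update contracts the sup-norm error by at most exp(-0.1), so the
# accumulated error stays below delta / (1 - exp(-0.1)).
gap = max(abs(a - b) for a, b in zip(z, z_q))
bound = delta / (1.0 - math.exp(-0.1))
print("quantization gap within bound:", gap <= bound)
```

In this sketch the value function can be recovered as V(s) = -log z(s), and the geometric-series bound on the quantization gap only illustrates the kind of error-propagation argument the abstract refers to; the paper's own bounds, which also cover bootstrapping error, may take a different form.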