SARSA(0) Reinforcement Learning over Fully Homomorphic Encryption
This work addresses privacy concerns in cloud-based control systems, with applications such as robotics and IoT. The contribution is incremental: it adapts an existing RL method to encrypted computation.
The paper tackles confidential cloud-based reinforcement learning by implementing SARSA(0) over Fully Homomorphic Encryption (FHE) to protect data privacy. It modifies the algorithm to account for encryption-induced delays, gives a convergence result for the modified update rule, and demonstrates the approach on a pole-balancing task.
We consider a cloud-based control architecture in which local plants outsource the control synthesis task to the cloud. In particular, we consider cloud-based reinforcement learning (RL), where the value function update is outsourced to the cloud. To achieve confidentiality, we carry out the computations over Fully Homomorphic Encryption (FHE). We use the CKKS encryption scheme and a modified SARSA(0) algorithm that accounts for encryption-induced delays. We then give a convergence result for the delayed update rule of SARSA(0) with a blocking mechanism. Finally, we present a numerical demonstration on a classical pole-balancing problem.
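The delayed update with blocking described above can be illustrated with a minimal plaintext sketch. It is not the paper's implementation: encryption is omitted entirely, the delay is simulated as a fixed number of environment steps, a toy chain environment stands in for pole balancing, and "blocking" is read here as "no new update request is sent while an earlier one is still in flight", which is only one possible interpretation. The name `delayed_sarsa0` and all of its parameters are hypothetical.

```python
import random


def delayed_sarsa0(n_states=5, delay=3, episodes=500,
                   alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular SARSA(0) on a toy chain; each update is applied `delay` steps late.

    Assumption: the fixed `delay` stands in for the round-trip time of an
    encrypted computation. No encryption is performed in this sketch.
    """
    rng = random.Random(seed)
    n_actions = 2  # 0 = move left, 1 = move right
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def policy(s):
        # Epsilon-greedy action selection with random tie-breaking.
        if rng.random() < eps:
            return rng.randrange(n_actions)
        best = max(Q[s])
        return rng.choice([a for a in range(n_actions) if Q[s][a] == best])

    def step(s, a):
        # Deterministic chain: reward 1 on reaching the right-most state.
        s2 = max(0, s - 1) if a == 0 else s + 1
        done = s2 == n_states - 1
        return s2, (1.0 if done else 0.0), done

    pending = None  # (ready_time, state, action, new_value) in flight
    t = 0
    for _ in range(episodes):
        s, done = 0, False
        a = policy(s)
        while not done:
            s2, r, done = step(s, a)
            a2 = policy(s2)
            # The simulated cloud returns its result once the delay has elapsed.
            if pending is not None and t >= pending[0]:
                _, ps, pa, new_q = pending
                Q[ps][pa] = new_q
                pending = None
            # Blocking (assumed reading): submit a request only when none is
            # in flight; transitions seen in the meantime are not used.
            if pending is None:
                target = r + (0.0 if done else gamma * Q[s2][a2])
                new_q = Q[s][a] + alpha * (target - Q[s][a])
                pending = (t + delay, s, a, new_q)
            s, a = s2, a2
            t += 1
    return Q
```

A short usage example, printing the learned action values per state:

```python
Q = delayed_sarsa0(delay=3)
for state, row in enumerate(Q):
    print(state, [round(q, 3) for q in row])
```

Because nothing else writes to the table while a request is in flight, the value that comes back was computed from the same table it is written into, so it is late but never stale. The cost is that only a fraction of the observed transitions contribute to learning.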