Locally Private Distributed Reinforcement Learning
This work addresses privacy concerns in distributed reinforcement learning for agents operating in sensitive environments, representing a novel application of LDP in this domain.
The paper tackles reinforcement learning across distributed private environments by proposing a locally differentially private algorithm that protects local agents' models from adversarial reverse engineering, and it evaluates the algorithm's performance under LDP empirically.
We study locally differentially private algorithms for reinforcement learning to obtain a robust policy that performs well across distributed private environments. Our algorithm protects the information in local agents' models from being exploited through adversarial reverse engineering. Because a local policy is strongly affected by its individual environment, an agent's output may unintentionally leak private information. In our proposed algorithm, local agents update the model in their own environments and report noisy gradients designed to satisfy local differential privacy (LDP), which gives a rigorous local privacy guarantee. Using the set of reported noisy gradients, a central aggregator updates its model and delivers it to the local agents. In our empirical evaluation, we demonstrate that our method performs well under LDP. To the best of our knowledge, this is the first work to realize distributed reinforcement learning under LDP, yielding a robust agent that performs well across distributed private environments.
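The report-and-aggregate loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the abstract does not name the noise mechanism, so L2 clipping followed by the classical Gaussian mechanism is assumed here, which gives (epsilon, delta)-LDP per report rather than pure epsilon-LDP. All function names and parameters (`clip_norm`, `epsilon`, `delta`, `lr`) are hypothetical. The sensitivity is taken as `2 * clip_norm` because, in the local model, any two clipped gradients can differ by at most that much in L2 norm, and the noise calibration used is the standard one that holds for `epsilon < 1`.

```python
import numpy as np


def clip_l2(grad, clip_norm):
    """Scale grad so that its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(grad)
    return grad * min(1.0, clip_norm / (norm + 1e-12))


def gaussian_sigma(clip_norm, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (assumes epsilon < 1)."""
    sensitivity = 2.0 * clip_norm  # worst-case distance between two clipped gradients
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon


def privatize_gradient(grad, clip_norm, epsilon, delta, rng):
    """Local agent side: clip the gradient, then add Gaussian noise before reporting."""
    sigma = gaussian_sigma(clip_norm, epsilon, delta)
    return clip_l2(grad, clip_norm) + rng.normal(0.0, sigma, size=grad.shape)


def aggregate_and_update(params, noisy_grads, lr):
    """Aggregator side: average the reported gradients and take one descent step."""
    return params - lr * np.mean(noisy_grads, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = np.zeros(4)
    # Stand-ins for gradients computed by 100 agents in their own environments.
    local_grads = [rng.normal(size=4) for _ in range(100)]
    reports = [
        privatize_gradient(g, clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
        for g in local_grads
    ]
    params = aggregate_and_update(params, reports, lr=0.1)
    print(params)
```

Only the noisy reports leave each agent, so the aggregator never sees a raw gradient. Averaging over many agents reduces the variance of the injected noise, which is why this kind of scheme tends to need a large number of participants to remain useful.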