LG · Dec 2, 2021

Differentially Private Exploration in Reinforcement Learning with Linear Representation

Paul Luyo, Evrard Garcelon, Alessandro Lazaric, Matteo Pirotta

arXiv:2112.01585v29.212 citations · h-index: 30

Originality: Incremental advance

AI Analysis

The paper addresses privacy concerns in reinforcement learning for applications such as healthcare and finance, offering incremental improvements in the form of new algorithms and regret bounds.

This paper tackles privacy-preserving exploration in reinforcement learning with linear representation, providing regret bounds under differential privacy in both the model-based and model-free settings, such as O~(K^{3/4}/√ε) for local DP and O~(√(K/ε)) for joint DP in the model-based setting.

This paper studies privacy-preserving exploration in Markov Decision Processes (MDPs) with linear representation. We first consider the setting of linear-mixture MDPs (Ayoub et al., 2020) (a.k.a. the model-based setting) and provide a unified framework for analyzing joint and local differentially private (DP) exploration. Through this framework, we prove a $\widetilde{O}(K^{3/4}/\sqrt{ε})$ regret bound for $(ε,δ)$-local DP exploration and a $\widetilde{O}(\sqrt{K/ε})$ regret bound for $(ε,δ)$-joint DP. We further study privacy-preserving exploration in linear MDPs (Jin et al., 2020) (a.k.a. the model-free setting), where we provide a $\widetilde{O}\left(K^{\frac{3}{5}}/ε^{\frac{2}{5}}\right)$ regret bound for $(ε,δ)$-joint DP, with a novel algorithm based on low-switching. Finally, we provide insights into the issues of designing local DP algorithms in this model-free setting.
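To get a feel for how the three stated rates compare, the following minimal sketch evaluates only the leading terms in $K$ and $ε$ from the abstract. It is an illustration under strong assumptions: constants, logarithmic factors, and any dependence on other problem parameters (dimension, horizon, $δ$) hidden by $\widetilde{O}$ are dropped, and the function name `regret_rates` and its dictionary keys are hypothetical, not taken from the paper.

```python
import math


def regret_rates(K: int, eps: float) -> dict:
    """Leading-order regret rates quoted in the abstract.

    Assumption: constants, log factors, and all parameters other than
    the number of episodes K and the privacy level eps are ignored.
    """
    if K <= 0 or eps <= 0:
        raise ValueError("K and eps must be positive")
    return {
        # Linear-mixture MDPs (model-based), (eps, delta)-local DP
        "linear-mixture, local DP": K ** 0.75 / math.sqrt(eps),
        # Linear-mixture MDPs (model-based), (eps, delta)-joint DP
        "linear-mixture, joint DP": math.sqrt(K / eps),
        # Linear MDPs (model-free), (eps, delta)-joint DP
        "linear MDP, joint DP": K ** 0.6 / eps ** 0.4,
    }


if __name__ == "__main__":
    for name, rate in regret_rates(K=10_000, eps=1.0).items():
        print(f"{name}: {rate:.1f}")
```

For $ε = 1$ and $K > 1$ the leading terms are ordered $K^{1/2} < K^{3/5} < K^{3/4}$, so under these simplifications the local DP rate grows fastest in $K$ and the model-based joint DP rate grows slowest.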
