LG · Dec 2, 2021

Differentially Private Exploration in Reinforcement Learning with Linear Representation

arXiv:2112.01585v2 · 12 citations
Originality: Incremental advance
AI Analysis

The paper addresses privacy concerns in reinforcement learning for applications such as healthcare and finance, offering incremental improvements through new algorithms and regret bounds.

This paper tackles privacy-preserving exploration in reinforcement learning with linear representation, providing regret bounds under differential privacy in both model-based and model-free settings, such as O~(K^{3/4}/√ε) for local DP and O~(√(K/ε)) for joint DP in the model-based setting.

This paper studies privacy-preserving exploration in Markov Decision Processes (MDPs) with linear representation. We first consider the setting of linear-mixture MDPs (Ayoub et al., 2020) (a.k.a. the model-based setting) and provide a unified framework for analyzing joint and local differentially private (DP) exploration. Through this framework, we prove a $\widetilde{O}(K^{3/4}/\sqrt{\epsilon})$ regret bound for $(\epsilon,\delta)$-local DP exploration and a $\widetilde{O}(\sqrt{K/\epsilon})$ regret bound for $(\epsilon,\delta)$-joint DP. We further study privacy-preserving exploration in linear MDPs (Jin et al., 2020) (a.k.a. the model-free setting), where we provide a $\widetilde{O}\left(K^{3/5}/\epsilon^{2/5}\right)$ regret bound for $(\epsilon,\delta)$-joint DP, with a novel algorithm based on low-switching. Finally, we provide insights into the issues of designing local DP algorithms in this model-free setting.
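To get a feel for how the three stated rates compare, the following minimal sketch evaluates their leading-order terms. It is an illustration only, not part of the paper: the function name `regret_rates` is hypothetical, $K$ is assumed to be the number of episodes, and all constants, logarithmic factors, and dependence on $\delta$, dimension, and horizon hidden by $\widetilde{O}$ are dropped.

```python
from typing import Dict


def regret_rates(K: int, eps: float) -> Dict[str, float]:
    """Leading-order terms of the regret bounds quoted in the abstract.

    Assumption: K is the number of episodes and eps is the privacy
    parameter. Constants and log factors hidden by O-tilde are ignored,
    so only the relative growth in K and eps is meaningful.
    """
    return {
        "linear-mixture, local DP": K ** 0.75 / eps ** 0.5,
        "linear-mixture, joint DP": (K / eps) ** 0.5,
        "linear MDP, joint DP": K ** 0.6 / eps ** 0.4,
    }


# Print the three rates for a few episode counts at eps = 1.
for K in (10**3, 10**5, 10**7):
    rates = regret_rates(K, eps=1.0)
    print(K, {name: round(value, 1) for name, value in rates.items()})
```

At a fixed privacy level, the joint DP rate for linear-mixture MDPs grows most slowly in $K$ and the local DP rate grows fastest, with the joint DP rate for linear MDPs in between.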
