LG CR DS · Oct 24, 2020

Differentially Private Online Submodular Maximization

Sebastian Perez-Salazar, Rachel Cummings

arXiv:2010.12816v12.39 citations · h-index: 15

Originality: Incremental advance

AI Analysis

This work addresses privacy-preserving online optimization of submodular functions. It is rated as incremental because it extends existing methods to incorporate differential privacy in a sequential decision-making context.

The paper tackles the problem of online submodular maximization under a cardinality constraint with differential privacy, developing algorithms for both the full-information and bandit settings that achieve expected regret bounds with concrete rates, such as $\mathcal{O}\left( \frac{k^2\log |U|\sqrt{T \log k/\delta}}{\varepsilon} \right)$ in the full-information case.

In this work we consider the problem of online submodular maximization under a cardinality constraint with differential privacy (DP). A stream of $T$ submodular functions over a common finite ground set $U$ arrives online, and at each time-step the decision maker must choose at most $k$ elements of $U$ before observing the function. The decision maker obtains a payoff equal to the function evaluated on the chosen set, and aims to learn a sequence of sets that achieves low expected regret. In the full-information setting, we develop an $(\varepsilon,\delta)$-DP algorithm with expected $(1-1/e)$-regret bound of $\mathcal{O}\left( \frac{k^2\log |U|\sqrt{T \log k/\delta}}{\varepsilon} \right)$. This algorithm contains $k$ ordered experts that learn the best marginal increments for each item over the whole time horizon while maintaining privacy of the functions. In the bandit setting, we provide an $(\varepsilon,\delta+ O(e^{-T^{1/3}}))$-DP algorithm with expected $(1-1/e)$-regret bound of $\mathcal{O}\left( \frac{\sqrt{\log k/\delta}}{\varepsilon} (k (|U| \log |U|)^{1/3})^2 T^{2/3} \right)$. Our algorithms contain $k$ ordered experts that learn the best marginal item to select given the items chosen by their predecessors, while maintaining privacy of the functions. One challenge for privacy in this setting is that the payoff and feedback of expert $i$ depend on the actions taken by her $i-1$ predecessors. This particular type of information leakage is not covered by post-processing, and new analysis is required. Our techniques for maintaining privacy with feedforward may be of independent interest.
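The ordered-experts structure described in the abstract can be illustrated with a minimal, non-private sketch of the full-information setting. This is not the paper's algorithm: the names `HedgeExpert`, `online_greedy`, and `coverage`, the learning rate `eta`, and the toy coverage objective are all assumptions made for illustration, and no attempt is made to calibrate anything to an $(\varepsilon,\delta)$ privacy guarantee. The sketch only shows how slot $i$ receives marginal-gain feedback relative to the items sampled by slots $1,\dots,i-1$.

```python
import math
import random


def coverage(sets_by_item):
    """Build a monotone submodular coverage function f(S) = |union of sets in S|."""
    def f(chosen):
        covered = set()
        for item in chosen:
            covered |= sets_by_item[item]
        return len(covered)
    return f


class HedgeExpert:
    """Exponential-weights learner over the ground set (hypothetical helper)."""

    def __init__(self, ground_set, eta):
        self.items = list(ground_set)
        self.eta = eta
        self.cum_gain = {u: 0.0 for u in self.items}

    def sample(self, rng):
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(self.cum_gain.values())
        weights = [math.exp(self.eta * (self.cum_gain[u] - top)) for u in self.items]
        return rng.choices(self.items, weights=weights, k=1)[0]

    def update(self, gains):
        for u, g in gains.items():
            self.cum_gain[u] += g


def online_greedy(functions, ground_set, k, eta, seed=0):
    """Run k ordered experts over a stream of set functions (non-private sketch)."""
    rng = random.Random(seed)
    experts = [HedgeExpert(ground_set, eta) for _ in range(k)]
    total = 0.0
    for f in functions:
        # Every expert commits to an item before f is revealed.
        picks = [expert.sample(rng) for expert in experts]
        total += f(set(picks))
        # Full-information feedback: expert i sees the marginal gain of every
        # item on top of the items picked by experts 1..i-1.
        prefix = set()
        for expert, pick in zip(experts, picks):
            base = f(prefix)
            expert.update({u: f(prefix | {u}) - base for u in expert.items})
            prefix.add(pick)
    return total, experts
```

In this sketch the feedback given to the second and later experts depends on what the earlier experts sampled, which is the dependence the abstract identifies as the obstacle to a post-processing argument. A private variant would additionally need to control how much each expert's sampling distribution reveals about the functions, which this sketch does not do.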
