LG CR ML · Feb 15, 2023

Multi-Task Differential Privacy Under Distribution Skew

Walid Krichene, Prateek Jain, Shuang Song, Mukund Sundararajan, Abhradeep Thakurta, Li Zhang

arXiv:2302.07975v1 · 8.84 citations · h-index: 44

Originality: Incremental advance

AI Analysis

The paper addresses privacy-preserving multi-task learning for applications such as recommendation systems, where skewed data distributions can degrade utility. The advance is rated incremental because it builds on existing differential privacy frameworks.

The paper tackles multi-task learning under user-level differential privacy with distribution skew, where tasks have varying data sizes, by proposing an adaptive algorithm that optimally allocates privacy budgets, resulting in quantifiable improvements in excess empirical risk and state-of-the-art performance on standard benchmarks.

We study the problem of multi-task learning under user-level differential privacy, in which $n$ users contribute data to $m$ tasks, each involving a subset of users. One important aspect of the problem that can significantly impact quality is the distribution skew among tasks. Certain tasks may have far fewer data samples than others, making them more susceptible to the noise added for privacy. It is natural to ask whether algorithms can adapt to this skew to improve the overall utility. We give a systematic analysis of the problem by studying how to optimally allocate a user's privacy budget among tasks. We propose a generic algorithm, based on an adaptive reweighting of the empirical loss, and show that when there is task distribution skew, this gives a quantifiable improvement in excess empirical risk. Experimental studies on recommendation problems that exhibit a long tail of small tasks demonstrate that our methods significantly improve utility, achieving the state of the art on two standard benchmarks.
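To make the intuition about skew concrete, here is a toy sketch in Python. It is not the paper's algorithm. It assumes a single hypothetical user who participates in every listed task, a fixed unit L2 contribution budget for that user, Gaussian noise of standard deviation `sigma` added to each task's weighted sum, and an illustrative power-law weighting `w_i ∝ n_i^(-alpha)`. The names `power_law_weights`, `effective_noise`, and the exponent `alpha` are all hypothetical and chosen only for this example.

```python
import math


def power_law_weights(task_sizes, alpha):
    """Hypothetical per-task weights w_i proportional to n_i^(-alpha).

    The weights are rescaled to unit L2 norm, which models a fixed
    total contribution budget for one user across all tasks.
    alpha = 0 gives a uniform split of the budget.
    """
    raw = [n ** (-alpha) for n in task_sizes]
    norm = math.sqrt(sum(w * w for w in raw))
    return [w / norm for w in raw]


def effective_noise(task_sizes, weights, sigma):
    """Noise std on each task's mean after undoing the weighting.

    Assumption: noise N(0, sigma^2) is added to the weighted sum
    w_i * (sum of n_i samples), so dividing by w_i * n_i leaves
    noise with std sigma / (w_i * n_i) on the task mean.
    """
    return [sigma / (w * n) for w, n in zip(weights, task_sizes)]


# Illustrative task sizes with a long tail of small tasks (made up).
sizes = [10_000, 100, 10]
uniform = power_law_weights(sizes, alpha=0.0)
skewed = power_law_weights(sizes, alpha=0.5)

noise_uniform = effective_noise(sizes, uniform, sigma=1.0)
noise_skewed = effective_noise(sizes, skewed, sigma=1.0)
```

Under these assumptions, shifting weight toward small tasks lowers the effective noise on the smallest task and raises it on the largest one, which is the trade-off that a budget allocation has to balance. How the paper chooses its weights is not specified in this abstract.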
