LGITFeb 20, 2025

On Theoretical Limits of Learning with Label Differential Privacy

arXiv:2502.14309v21 citationsh-index: 4
Originality Incremental advance
AI Analysis

This work addresses the fundamental performance trade-offs in privacy-preserving machine learning for scenarios with private labels and public features, providing theoretical insights for researchers and practitioners.

The paper investigates the theoretical limits of learning with label differential privacy (DP) for classification and regression, establishing minimax convergence rates and showing that label local DP yields significantly faster risk convergence than full local DP, while label central DP offers only constant-factor improvements over full DP.

Label differential privacy (DP) is designed for learning problems involving private labels and public features. While various methods have been proposed for learning under label DP, the theoretical limits remain largely unexplored. In this paper, we investigate the fundamental limits of learning with label DP in both local and central models for both classification and regression tasks, characterized by minimax convergence rates. We establish lower bounds by converting each task into a multiple hypothesis testing problem and bounding the test error. Additionally, we develop algorithms that yield matching upper bounds. Our results demonstrate that under label local DP (LDP), the risk has a significantly faster convergence rate than that under full LDP, i.e. protecting both features and labels, indicating the advantages of relaxing the DP definition to focus solely on labels. In contrast, under the label central DP (CDP), the risk is only reduced by a constant factor compared to full DP, indicating that the relaxation of CDP only has limited benefits on the performance.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes