LG | Sep 17, 2022

On PAC Learning Halfspaces in Non-interactive Local Privacy Model with Public Unlabeled Data

arXiv:2209.08319v1 | 2 citations | h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses privacy-preserving machine learning, offering incremental improvements in sample efficiency for one specific setting: non-interactive local differential privacy with public unlabeled data.

The paper tackles PAC learning of halfspaces in the non-interactive local differential privacy model by leveraging public unlabeled data. It achieves sample complexities that are linear in the dimension and polynomial in the other terms, a significant improvement over prior results.

In this paper, we study the problem of PAC learning halfspaces in the non-interactive local differential privacy (NLDP) model. To breach the barrier of exponential sample complexity, previous results studied a relaxed setting where the server has access to some additional public but unlabeled data. We continue in this direction. Specifically, we consider the problem under the standard setting instead of the large margin setting studied before. Under different mild assumptions on the underlying data distribution, we propose two approaches, based on the Massart noise model and on self-supervised learning respectively, and show that it is possible to achieve sample complexities that are only linear in the dimension and polynomial in other terms for both private and public data, which significantly improves on the previous results. Our methods could also be used for other private PAC learning problems.
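
As a rough illustration of why a noise model such as Massart noise is a natural fit for local privacy, the sketch below applies binary randomized response to a single label. It is not the paper's algorithm. The function names `randomize_label` and `flip_probability` are hypothetical, and the sketch protects only the label, while the features would need their own mechanism. Each user sends one report and receives nothing back, which is what makes the protocol non-interactive. The label is flipped with probability 1/(1 + e^ε), which is strictly below 1/2 for any ε > 0, so the server sees labels corrupted by bounded noise.

```python
import math
import random


def flip_probability(epsilon: float) -> float:
    """Probability that randomized response flips the label (< 1/2 when epsilon > 0)."""
    return 1.0 / (1.0 + math.exp(epsilon))


def randomize_label(label: int, epsilon: float, rng: random.Random) -> int:
    """Report a {-1, +1} label through binary randomized response.

    The true label is kept with probability e^eps / (1 + e^eps) and flipped
    otherwise, so the two output probabilities differ by a factor of at most
    e^eps. That is the eps-local differential privacy guarantee for the label.
    """
    if label not in (-1, 1):
        raise ValueError("label must be -1 or +1")
    return -label if rng.random() < flip_probability(epsilon) else label


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 1.0
    # Each user sends a single privatized report to the server.
    reports = [randomize_label(1, epsilon, rng) for _ in range(10_000)]
    flipped = sum(1 for r in reports if r == -1) / len(reports)
    print(f"theoretical flip rate: {flip_probability(epsilon):.4f}")
    print(f"empirical flip rate:   {flipped:.4f}")
```

A smaller ε pushes the flip rate toward 1/2, which gives stronger privacy but noisier labels. This is the trade-off that drives the sample complexity in locally private learning.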

