LG AI CR DS | Dec 4, 2023

Hot PATE: Private Aggregation of Distributions for Diverse Tasks

Edith Cohen, Benjamin Cohen-Wang, Xin Lyu, Jelani Nelson, Tamas Sarlos, Uri Stemmer

arXiv:2312.02132v45.35 citations | h-index: 26

Originality: Highly original

AI Analysis

This paper addresses privacy-preserving machine learning for tasks with diverse outputs, such as text generation, and offers a drop-in replacement for existing Cold PATE samplers.

The paper tackles the tension between output diversity and privacy in the PATE framework for generative tasks. It proposes Hot PATE, which preserves diversity without additional privacy cost, and reports empirical results showing significant improvements in the privacy-utility trade-off.

The Private Aggregation of Teacher Ensembles (PATE) framework enables privacy-preserving machine learning by aggregating responses from disjoint subsets of sensitive data. Adaptations of PATE to tasks with inherent output diversity such as text generation, where the desired output is a sample from a distribution, face a core tension: as diversity increases, samples from different teachers are less likely to agree, but lower agreement results in reduced utility for the same privacy requirements. Yet suppressing diversity to artificially increase agreement is undesirable, as it distorts the output of the underlying model and thus reduces output quality. We propose Hot PATE, a variant of PATE designed for diverse generative settings. We formalize the notion of a diversity-preserving ensemble sampler and introduce an efficient sampler that provably transfers diversity without incurring additional privacy cost. Hot PATE requires only API access to proprietary models and can be used as a drop-in replacement for existing Cold PATE samplers. Our empirical evaluations corroborate and quantify the benefits, showing significant improvements in the privacy-utility trade-off on evaluated in-context learning tasks, both in preserving diversity and in returning relevant responses.
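The tension between agreement and diversity described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's sampler: it assumes every teacher samples a token independently from the same distribution, and it uses a Laplace noisy-argmax over the vote histogram as a stand-in for a cold-PATE-style aggregator. All function names, parameters, and the example distributions are hypothetical.

```python
import random
from collections import Counter


def teacher_votes(probs, n_teachers, rng):
    """Each teacher independently samples one token index from `probs` (assumed setup)."""
    tokens = list(range(len(probs)))
    return rng.choices(tokens, weights=probs, k=n_teachers)


def noisy_argmax(votes, n_tokens, noise_scale, rng):
    """Histogram the votes, add Laplace noise to each count, return the top token."""
    counts = Counter(votes)

    def laplace():
        # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1); scale it.
        return noise_scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

    noisy = [counts.get(t, 0) + laplace() for t in range(n_tokens)]
    return max(range(n_tokens), key=noisy.__getitem__)


def top_vote_share(probs, n_teachers, rng):
    """Fraction of teachers that voted for the most popular token."""
    votes = teacher_votes(probs, n_teachers, rng)
    return Counter(votes).most_common(1)[0][1] / n_teachers


rng = random.Random(0)
n_teachers = 200

low_diversity = [0.97, 0.01, 0.01, 0.01]   # one dominant token
high_diversity = [1.0 / 50] * 50           # 50 equally likely tokens

low_share = top_vote_share(low_diversity, n_teachers, rng)
high_share = top_vote_share(high_diversity, n_teachers, rng)
print(f"top vote share, low diversity:  {low_share:.2f}")
print(f"top vote share, high diversity: {high_share:.2f}")
```

With independent sampling, the largest vote count shrinks as the distribution spreads out, so noise of a fixed scale is more likely to swamp the signal. This is the agreement loss that the abstract says Hot PATE is designed to avoid.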
