MLJan 16, 2015

Differentially Private Bayesian Optimization

Matt J. Kusner, Jacob R. Gardner, Roman Garnett, Kilian Q. Weinberger

arXiv:1501.04080v214.760 citations

Originality Incremental advance

AI Analysis

This addresses privacy risks for practitioners using sensitive data like genetic or personal information in automated machine learning tuning, representing an incremental improvement by applying differential privacy to an existing method.

The paper tackles the problem of protecting sensitive data during hyper-parameter tuning with Bayesian optimization by introducing differentially private methods, proving near-optimal results under Gaussian process assumptions and providing privacy guarantees even without them.

Bayesian optimization is a powerful tool for fine-tuning the hyper-parameters of a wide variety of machine learning models. The success of machine learning has led practitioners in diverse real-world settings to learn classifiers for practical problems. As machine learning becomes commonplace, Bayesian optimization becomes an attractive method for practitioners to automate the process of classifier hyper-parameter tuning. A key observation is that the data used for tuning models in these settings is often sensitive. Certain data such as genetic predisposition, personal email statistics, and car accident history, if not properly private, may be at risk of being inferred from Bayesian optimization outputs. To address this, we introduce methods for releasing the best hyper-parameters and classifier accuracy privately. Leveraging the strong theoretical guarantees of differential privacy and known Bayesian optimization convergence bounds, we prove that under a GP assumption these private quantities are also near-optimal. Finally, even if this assumption is not satisfied, we can use different smoothness guarantees to protect privacy.

View on arXiv PDF

Similar