ML AI CR LG PR | Nov 15, 2023

Are Normalizing Flows the Key to Unlocking the Exponential Mechanism?

Robert A. Bridges, Vandy J. Tombs, Christopher B. Stanley

arXiv:2311.09200v42.31 citations | h-index: 3 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficient and accurate private machine learning for domains such as healthcare. It is rated incremental because it builds on existing methods and lacks a formal privacy proof for the NF approximation.

The paper tackles the use of the Exponential Mechanism (ExpM) for private optimization on continuous sample spaces by employing a Normalizing Flow (NF) to approximately sample from its intractable density. The resulting method, ExpM+NF, achieves accuracy nearly matching non-private SGD, is more accurate than DPSGD, and trains faster than Opacus' DPSGD implementation.

The Exponential Mechanism (ExpM), designed for private optimization, has historically been sidelined on continuous sample spaces because it requires sampling from a generally intractable density and, to a lesser extent, bounding the sensitivity of the objective function. Any differential privacy (DP) mechanism can be instantiated as ExpM, and ExpM offers an elegant solution for private machine learning (ML) that bypasses inherent inefficiencies of DPSGD. This paper seeks to operationalize ExpM for private optimization and ML by using an auxiliary Normalizing Flow (NF), an expressive deep network for density learning, to approximately sample from the ExpM density. The resulting method, ExpM+NF, is an alternative to SGD methods for model training. We prove a sensitivity bound for the $\ell^2$ loss, permitting ExpM use with any sampling method. To test feasibility, we present results on MIMIC-III health data comparing the accuracy and training time of (non-private) SGD, DPSGD, and ExpM+NF. We find that a model sampled from ExpM+NF is nearly as accurate as non-private SGD and more accurate than DPSGD, and that ExpM+NF trains faster than Opacus' DPSGD implementation. Unable to provide a privacy proof for the NF approximation, we present empirical results to investigate privacy, including the LiRA membership inference attack of Carlini et al. and the recent privacy auditing lower bound method of Steinke et al. Our findings suggest ExpM+NF provides more privacy than non-private SGD, but not as much as DPSGD, although many attacks are ineffective against any of the models. Ancillary benefits of this work include advancing the state of the art (SOTA) in privacy and accuracy on MIMIC-III healthcare data, exhibiting the use of ExpM+NF for Bayesian inference, showing the limitations of empirical privacy auditing in practice, and providing several privacy theorems applicable to distribution learning.
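To make the sampling difficulty concrete, here is a minimal sketch of the standard Exponential Mechanism on a finite candidate set, where the normalizing constant is just a sum and sampling is easy. This is not the paper's ExpM+NF method: the one-parameter toy model, the grid, the data values, and the function name `exponential_mechanism` are all assumptions made for illustration. On a continuous, high-dimensional parameter space the same normalization becomes an intractable integral, which is the gap the abstract says the Normalizing Flow is meant to fill.

```python
import numpy as np


def exponential_mechanism(utilities, epsilon, sensitivity, rng):
    """Sample an index with probability proportional to
    exp(epsilon * utility / (2 * sensitivity)) (standard discrete ExpM)."""
    utilities = np.asarray(utilities, dtype=float)
    logits = epsilon * utilities / (2.0 * sensitivity)
    logits -= logits.max()          # subtract the max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()            # tractable here only because the set is finite
    return int(rng.choice(len(utilities), p=probs)), probs


# Hypothetical toy problem: privately pick a scalar parameter theta in [0, 1].
rng = np.random.default_rng(0)
data = np.array([0.2, 0.4, 0.5, 0.7])      # illustrative records, each in [0, 1]
grid = np.linspace(0.0, 1.0, 101)          # finite set of candidate parameters

# Utility = negative sum of squared errors. Each per-record term lies in [0, 1],
# so changing one record moves the utility by at most 1 (sensitivity = 1).
utilities = -((grid[:, None] - data[None, :]) ** 2).sum(axis=1)

idx, probs = exponential_mechanism(utilities, epsilon=1.0, sensitivity=1.0, rng=rng)
theta_private = grid[idx]
print(f"sampled theta = {theta_private:.2f}")
```

Larger values of `epsilon` concentrate the sampling distribution around the best candidate, while smaller values flatten it toward uniform, which is the usual privacy and accuracy trade-off.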
