Jack Jewson

ML
5papers
220citations
Novelty67%
AI Score31

5 Papers

MLJul 11, 2023
Differentially Private Statistical Inference through $β$-Divergence One Posterior Sampling

Jack Jewson, Sahra Ghalebikesabi, Chris Holmes

Differential privacy guarantees allow the results of a statistical analysis involving sensitive data to be released without compromising the privacy of any individual taking part. Achieving such guarantees generally requires the injection of noise, either directly into parameter estimates or into the estimation process. Instead of artificially introducing perturbations, sampling from Bayesian posterior distributions has been shown to be a special case of the exponential mechanism, producing consistent, and efficient private estimates without altering the data generative process. The application of current approaches has, however, been limited by their strong bounding assumptions which do not hold for basic models, such as simple linear regressors. To ameliorate this, we propose $β$D-Bayes, a posterior sampling scheme from a generalised posterior targeting the minimisation of the $β$-divergence between the model and the data generating process. This provides private estimation that is generally applicable without requiring changes to the underlying model and consistently learns the data generating parameter. We show that $β$D-Bayes produces more precise inference estimation for the same privacy guarantees, and further facilitates differentially private estimation via posterior sampling for complex classifiers and continuous regression models such as neural networks for the first time.

MLAug 24, 2021
Mitigating Statistical Bias within Differentially Private Synthetic Data

Sahra Ghalebikesabi, Harrison Wilde, Jack Jewson et al.

Increasing interest in privacy-preserving machine learning has led to new and evolved approaches for generating private synthetic data from undisclosed real data. However, mechanisms of privacy preservation can significantly reduce the utility of synthetic data, which in turn impacts downstream tasks such as learning predictive models or inference. We propose several re-weighting strategies using privatised likelihood ratios that not only mitigate statistical bias of downstream estimators but also have general applicability to differentially private generative models. Through large-scale empirical evaluation, we show that private importance weighting provides simple and effective privacy-compliant augmentation for general applications of synthetic data.

LGNov 16, 2020
Foundations of Bayesian Learning from Synthetic Data

Harrison Wilde, Jack Jewson, Sebastian Vollmer et al.

There is significant growth and interest in the use of synthetic data as an enabler for machine learning in environments where the release of real data is restricted due to privacy or availability constraints. Despite a large number of methods for synthetic data generation, there are comparatively few results on the statistical properties of models learnt on synthetic data, and fewer still for situations where a researcher wishes to augment real data with another party's synthesised data. We use a Bayesian paradigm to characterise the updating of model parameters when learning in these settings, demonstrating that caution should be taken when applying conventional learning algorithms without appropriate consideration of the synthetic data generating process and learning task. Recent results from general Bayesian updating support a novel and robust approach to Bayesian synthetic-learning founded on decision theory that outperforms standard approaches across repeated experiments on supervised learning and inference problems.

MLApr 3, 2019
Generalized Variational Inference: Three arguments for deriving new Posteriors

Jeremias Knoblauch, Jack Jewson, Theodoros Damoulas

We advocate an optimization-centric view on and introduce a novel generalization of Bayesian inference. Our inspiration is the representation of Bayes' rule as infinite-dimensional optimization problem (Csiszar, 1975; Donsker and Varadhan; 1975, Zellner; 1988). First, we use it to prove an optimality result of standard Variational Inference (VI): Under the proposed view, the standard Evidence Lower Bound (ELBO) maximizing VI posterior is preferable to alternative approximations of the Bayesian posterior. Next, we argue for generalizing standard Bayesian inference. The need for this arises in situations of severe misalignment between reality and three assumptions underlying standard Bayesian inference: (1) Well-specified priors, (2) well-specified likelihoods, (3) the availability of infinite computing power. Our generalization addresses these shortcomings with three arguments and is called the Rule of Three (RoT). We derive it axiomatically and recover existing posteriors as special cases, including the Bayesian posterior and its approximation by standard VI. In contrast, approximations based on alternative ELBO-like objectives violate the axioms. Finally, we study a special case of the RoT that we call Generalized Variational Inference (GVI). GVI posteriors are a large and tractable family of belief distributions specified by three arguments: A loss, a divergence and a variational family. GVI posteriors have appealing properties, including consistency and an interpretation as approximate ELBO. The last part of the paper explores some attractive applications of GVI in popular machine learning models, including robustness and more appropriate marginals. After deriving black box inference schemes for GVI posteriors, their predictive performance is investigated on Bayesian Neural Networks and Deep Gaussian Processes, where GVI can comprehensively improve upon existing methods.

MLJun 6, 2018
Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with $β$-Divergences

Jeremias Knoblauch, Jack Jewson, Theodoros Damoulas

We present the very first robust Bayesian Online Changepoint Detection algorithm through General Bayesian Inference (GBI) with $β$-divergences. The resulting inference procedure is doubly robust for both the parameter and the changepoint (CP) posterior, with linear time and constant space complexity. We provide a construction for exponential models and demonstrate it on the Bayesian Linear Regression model. In so doing, we make two additional contributions: Firstly, we make GBI scalable using Structural Variational approximations that are exact as $β\to 0$. Secondly, we give a principled way of choosing the divergence parameter $β$ by minimizing expected predictive loss on-line. Reducing False Discovery Rates of CPs from more than 90% to 0% on real world data, this offers the state of the art.