COApr 4, 2022
Scalable Spike-and-SlabNiloy Biswas, Lester Mackey, Xiao-Li Meng
Spike-and-slab priors are commonly used for Bayesian variable selection, due to their interpretability and favorable statistical properties. However, existing samplers for spike-and-slab posteriors incur prohibitive computational costs when the number of variables is large. In this article, we propose Scalable Spike-and-Slab ($S^3$), a scalable Gibbs sampling implementation for high-dimensional Bayesian regression with the continuous spike-and-slab prior of George and McCulloch (1993). For a dataset with $n$ observations and $p$ covariates, $S^3$ has order $\max\{ n^2 p_t, np \}$ computational cost at iteration $t$ where $p_t$ never exceeds the number of covariates switching spike-and-slab states between iterations $t$ and $t-1$ of the Markov chain. This improves upon the order $n^2 p$ per-iteration cost of state-of-the-art implementations as, typically, $p_t$ is substantially smaller than $p$. We apply $S^3$ on synthetic and real-world datasets, demonstrating orders of magnitude speed-ups over existing exact samplers and significant gains in inferential quality over approximate samplers with comparable cost.
CRAug 24, 2020
Congenial Differential Privacy under Mandated DisclosureRuobin Gong, Xiao-Li Meng
Differentially private data releases are often required to satisfy a set of external constraints that reflect the legal, ethical, and logical mandates to which the data curator is obligated. The enforcement of constraints, when treated as post-processing, adds an extra phase in the production of privatized data. It is well understood in the theory of multi-phase processing that congeniality, a form of procedural compatibility between phases, is a prerequisite for the end users to straightforwardly obtain statistically valid results. Congenial differential privacy is theoretically principled, which facilitates transparency and intelligibility of the mechanism that would otherwise be undermined by ad-hoc post-processing procedures. We advocate for the systematic integration of mandated disclosure into the design of the privacy mechanism via standard probabilistic conditioning on the invariant margins. Conditioning automatically renders congeniality because any extra post-processing phase becomes unnecessary. We provide both initial theoretical guarantees and a Markov chain algorithm for our proposal. We also discuss intriguing theoretical issues that arise in comparing congenital differential privacy and optimization-based post-processing, as well as directions for further research.