COLGMEMLMay 16, 2019

The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions

arXiv:1905.06501v329 citations
Originality Incremental advance
AI Analysis

This addresses a fundamental problem in fields like biology and medicine by enabling efficient Bayesian interaction discovery, though it is an incremental improvement over existing methods.

The paper tackles the computational intractability of Bayesian methods for discovering pairwise interactions in high-dimensional data by introducing a Gaussian process representation that reduces runtime and memory usage to linear scaling. It demonstrates orders of magnitude faster runtime, lower error rates compared to LASSO-based methods, and improved computational scaling in high dimensions.

Discovering interaction effects on a response of interest is a fundamental problem faced in biology, medicine, economics, and many other scientific disciplines. In theory, Bayesian methods for discovering pairwise interactions enjoy many benefits such as coherent uncertainty quantification, the ability to incorporate background knowledge, and desirable shrinkage properties. In practice, however, Bayesian methods are often computationally intractable for even moderate-dimensional problems. Our key insight is that many hierarchical models of practical interest admit a particular Gaussian process (GP) representation; the GP allows us to capture the posterior with a vector of O(p) kernel hyper-parameters rather than O(p^2) interactions and main effects. With the implicit representation, we can run Markov chain Monte Carlo (MCMC) over model hyper-parameters in time and memory linear in p per iteration. We focus on sparsity-inducing models and show on datasets with a variety of covariate behaviors that our method: (1) reduces runtime by orders of magnitude over naive applications of MCMC, (2) provides lower Type I and Type II error relative to state-of-the-art LASSO-based approaches, and (3) offers improved computational scaling in high dimensions relative to existing Bayesian and LASSO-based approaches.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes