LGAIMLDec 17, 2015

Probabilistic Programming with Gaussian Process Memoization

arXiv:1512.05665v24 citations
Originality Incremental advance
AI Analysis

This addresses the problem of making GPs more accessible and practical for researchers and practitioners in fields like machine learning and statistics, though it is incremental as it builds on existing probabilistic programming methods.

The paper tackles the difficulty of applying Gaussian Processes (GPs) to complex tasks by embedding them in probabilistic programming languages using memoization, resulting in a flexible library (gpmem) that enables applications like robust regression and Bayesian optimization with minimal code (e.g., 50-line library, <20 lines per application).

Gaussian Processes (GPs) are widely used tools in statistics, machine learning, robotics, computer vision, and scientific computation. However, despite their popularity, they can be difficult to apply; all but the simplest classification or regression applications require specification and inference over complex covariance functions that do not admit simple analytical posteriors. This paper shows how to embed Gaussian processes in any higher-order probabilistic programming language, using an idiom based on memoization, and demonstrates its utility by implementing and extending classic and state-of-the-art GP applications. The interface to Gaussian processes, called gpmem, takes an arbitrary real-valued computational process as input and returns a statistical emulator that automatically improve as the original process is invoked and its input-output behavior is recorded. The flexibility of gpmem is illustrated via three applications: (i) robust GP regression with hierarchical hyper-parameter learning, (ii) discovering symbolic expressions from time-series data by fully Bayesian structure learning over kernels generated by a stochastic grammar, and (iii) a bandit formulation of Bayesian optimization with automatic inference and action selection. All applications share a single 50-line Python library and require fewer than 20 lines of probabilistic code each.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes