ML · LG · ME · Oct 6, 2020

Splitting Gaussian Process Regression for Streaming Data

arXiv:2010.02424v1 · 12 citations
AI Analysis

This paper addresses the computational bottleneck of applying Gaussian processes to streaming data, though the contribution is incremental, since it builds on existing local methods.

The paper tackles the poor scaling of Gaussian processes on streaming data by proposing an algorithm that sequentially partitions the input space and fits a localized Gaussian process to each region, improving on the time and space complexity of existing methods and achieving linear memory complexity.

Gaussian processes offer a flexible kernel method for regression. While Gaussian processes have many useful theoretical properties and have proven practically useful, they suffer from poor scaling in the number of observations. In particular, the cubic time complexity of updating standard Gaussian process models makes them generally unsuitable for application to streaming data. We propose an algorithm for sequentially partitioning the input space and fitting a localized Gaussian process to each disjoint region. The algorithm is shown to have superior time and space complexity to existing methods, and its sequential nature permits application to streaming data. The algorithm constructs a model for which the time complexity of updating is tightly bounded above by a pre-specified parameter. To the best of our knowledge, the model is the first local Gaussian process regression model to achieve linear memory complexity. Theoretical continuity properties of the model are proven. We demonstrate the efficacy of the resulting model on multi-dimensional regression tasks for streaming data.
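
To make the idea in the abstract concrete, here is a minimal Python sketch of a splitting local Gaussian process for streaming data. It is not the authors' algorithm. The splitting rule (median cut along the widest dimension), the hard routing of each point to a single region, the RBF kernel with zero prior mean, and every name used (`SplittingGP`, `Region`, `max_points`) are assumptions chosen for brevity. In particular, hard routing would not give the continuity properties the paper proves. The sketch only illustrates two points: each observation is stored in exactly one region, so memory grows linearly, and the cubic-cost linear algebra only ever touches about `max_points` observations.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and the rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


class Region:
    """A node of the partition tree; leaves hold the data of one region."""

    def __init__(self, dim):
        self.X = np.empty((0, dim))
        self.y = np.empty(0)
        self.split_dim = None  # set once the region has been split
        self.threshold = None
        self.left = None
        self.right = None

    def is_leaf(self):
        return self.split_dim is None


class SplittingGP:
    """Hypothetical sketch: local exact GPs on a sequentially refined partition."""

    def __init__(self, dim, max_points=50, noise=1e-2, length_scale=1.0):
        self.dim = dim
        self.max_points = max_points  # pre-specified cap on points per region
        self.noise = noise            # observation noise variance
        self.length_scale = length_scale
        self.root = Region(dim)

    def _find_leaf(self, x):
        node = self.root
        while not node.is_leaf():
            node = node.left if x[node.split_dim] <= node.threshold else node.right
        return node

    def update(self, x, y):
        # Route the new observation to its region and split if the cap is exceeded.
        x = np.asarray(x, dtype=float)
        leaf = self._find_leaf(x)
        leaf.X = np.vstack([leaf.X, x])
        leaf.y = np.append(leaf.y, y)
        if len(leaf.y) > self.max_points:
            self._split(leaf)

    def _split(self, node):
        # Assumed rule: cut at the median of the dimension with the largest spread.
        spread = node.X.max(axis=0) - node.X.min(axis=0)
        d = int(np.argmax(spread))
        t = float(np.median(node.X[:, d]))
        mask = node.X[:, d] <= t
        if mask.all() or not mask.any():
            return  # degenerate data (e.g. duplicates); the region stays oversized
        node.left, node.right = Region(self.dim), Region(self.dim)
        node.left.X, node.left.y = node.X[mask], node.y[mask]
        node.right.X, node.right.y = node.X[~mask], node.y[~mask]
        node.split_dim, node.threshold = d, t
        # Drop the parent's copy so each point is stored exactly once.
        node.X, node.y = np.empty((0, self.dim)), np.empty(0)

    def predict(self, x):
        # Exact GP posterior mean and latent variance using only the local region.
        x = np.asarray(x, dtype=float)
        leaf = self._find_leaf(x)
        if len(leaf.y) == 0:
            return 0.0, 1.0  # prior mean and variance before any data arrives
        K = rbf_kernel(leaf.X, leaf.X, self.length_scale)
        K += self.noise * np.eye(len(leaf.y))
        k_star = rbf_kernel(leaf.X, x[None, :], self.length_scale)[:, 0]
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, leaf.y))
        v = np.linalg.solve(L, k_star)
        return float(k_star @ alpha), float(1.0 - v @ v)
```

A caller would stream observations with `model.update(x, y)` and query with `model.predict(x)`. In this sketch the Cholesky factorization is recomputed at prediction time for simplicity. A more careful version would cache and incrementally update each region's factor.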

