LGSPMLApr 21, 2019

Linear Multiple Low-Rank Kernel Based Stationary Gaussian Processes Regression for Time Series

arXiv:1904.09559v135 citations
Originality Incremental advance
AI Analysis

This work addresses kernel design difficulties in Gaussian processes for time series modeling, offering incremental improvements in efficiency and accuracy for machine learning applications in time series analysis.

The paper tackles the challenge of Gaussian process kernel design and hyper-parameter optimization for time series regression by proposing a grid spectral mixture kernel that approximates stationary kernels as a linear combination of low-rank sub-kernels, with experimental results showing it outperforms competitors in prediction mean-squared-error and stability on classic datasets.

Gaussian processes (GP) for machine learning have been studied systematically over the past two decades and they are by now widely used in a number of diverse applications. However, GP kernel design and the associated hyper-parameter optimization are still hard and to a large extend open problems. In this paper, we consider the task of GP regression for time series modeling and analysis. The underlying stationary kernel can be approximated arbitrarily close by a new proposed grid spectral mixture (GSM) kernel, which turns out to be a linear combination of low-rank sub-kernels. In the case where a large number of the sub-kernels are used, either the Nyström or the random Fourier feature approximations can be adopted to deal efficiently with the computational demands. The unknown GP hyper-parameters consist of the non-negative weights of all sub-kernels as well as the noise variance; their estimation is performed via the maximum-likelihood (ML) estimation framework. Two efficient numerical optimization methods for solving the unknown hyper-parameters are derived, including a sequential majorization-minimization (MM) method and a non-linearly constrained alternating direction of multiplier method (ADMM). The MM matches perfectly with the proven low-rank property of the proposed GSM sub-kernels and turns out to be a part of efficiency, stable, and efficient solver, while the ADMM has the potential to generate better local minimum in terms of the test MSE. Experimental results, based on various classic time series data sets, corroborate that the proposed GSM kernel-based GP regression model outperforms several salient competitors of similar kind in terms of prediction mean-squared-error and numerical stability.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes