Robust Gaussian Process Regression Based on Iterative Trimming
This work provides a more robust regression method for researchers and practitioners who use Gaussian processes, particularly on datasets prone to outliers, a common problem in real-world applications.
This paper addresses the bias that outliers introduce into Gaussian process (GP) regression by proposing a new algorithm that iteratively trims extreme data points. The method significantly improves model accuracy for contaminated data, outperforming the standard GP and the Student-t likelihood variant in most test cases.
Gaussian process (GP) regression can be severely biased when the data are contaminated by outliers. This paper presents a new robust GP regression algorithm that iteratively trims the most extreme data points. While the new algorithm retains the attractive properties of the standard GP as a nonparametric and flexible regression method, it can greatly improve the model accuracy for contaminated data, even in the presence of extreme or abundant outliers. It is also easier to implement than previous robust GP variants that rely on approximate inference. In a wide range of experiments with different contamination levels, the proposed method significantly outperforms the standard GP and the popular robust GP variant with the Student-t likelihood in most test cases. In addition, as a practical example from astrophysics, we show that this method can precisely determine the main-sequence ridge line in the color-magnitude diagram of star clusters.
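The idea of iterative trimming can be illustrated with a minimal sketch. The abstract does not specify the trimming rule, the stopping criterion, or how hyperparameters are handled, so everything below is an assumption made for illustration: a fixed RBF kernel, fixed noise level, a fixed `keep_fraction`, and trimming by standardized absolute residual. The names `rbf_kernel`, `gp_predict`, and `trimmed_gp` are hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel with unit signal variance (assumed, not from the paper).
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise=0.1, length_scale=1.0):
    # Standard GP posterior mean and predictive variance for 1-D inputs,
    # with fixed hyperparameters (no marginal-likelihood optimization).
    K = rbf_kernel(x_train, x_train, length_scale) + noise**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    # Prior variance is 1.0 for this kernel; add the noise variance for observations.
    var = 1.0 - np.sum(v**2, axis=0) + noise**2
    return mean, var

def trimmed_gp(x, y, keep_fraction=0.8, n_iter=10, **gp_kwargs):
    # Assumed trimming loop: fit on the current subset, score ALL points by
    # standardized residual, keep the best-fitting fraction, and refit.
    n = len(x)
    n_keep = int(np.ceil(keep_fraction * n))
    keep = np.arange(n)  # start from the full, possibly contaminated, sample
    for _ in range(n_iter):
        mean, var = gp_predict(x[keep], y[keep], x, **gp_kwargs)
        score = np.abs(y - mean) / np.sqrt(var)
        new_keep = np.sort(np.argsort(score)[:n_keep])
        if np.array_equal(new_keep, keep):
            break  # the retained subset no longer changes
        keep = new_keep
    return keep
```

A final call such as `gp_predict(x[keep], y[keep], x_new)` then gives predictions from the trimmed subset. A fixed `keep_fraction` discards good points when the data are clean, which is one reason a practical method would need a more careful rule than this sketch uses.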