LG ML | Feb 24, 2025

Distributionally Robust Active Learning for Gaussian Process Regression

Shion Takeno, Yoshito Okura, Yu Inatsu, Tatsuya Aoyama, Tomonari Tanaka, Satoshi Akahane, Hiroyuki Hanada, Noriaki Hashimoto, Taro Murayama, Hanju Lee, Shinya Kojima, Ichiro Takeuchi

arXiv:2502.16870v311.42 citations | h-index: 11 | ICML

Originality: Incremental advance

AI Analysis

This work addresses the challenge of guaranteeing prediction accuracy in active learning for Gaussian process regression, which matters for applications that require efficient data collection. The contribution appears incremental, since it builds on existing concepts from distributionally robust learning.

The paper tackles active learning for Gaussian process regression by proposing two methods that reduce the worst-case expected error. It derives an upper bound on the worst-case expected squared error, which suggests that the error can be made arbitrarily small with a finite number of data labels under mild conditions.
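For orientation, the quantity being bounded can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $f$ is the unknown function, $\mu_T$ is the GPR prediction after $T$ labels, and $\mathcal{P}$ is the set of candidate target distributions.

```latex
% Worst-case expected squared error over the candidate target distributions
% (symbols f, \mu_T, \mathcal{P} are illustrative assumptions)
E_T \;=\; \sup_{p \in \mathcal{P}} \; \mathbb{E}_{x^{*} \sim p}
      \left[ \bigl( f(x^{*}) - \mu_T(x^{*}) \bigr)^{2} \right]
```

Read this way, the summary's claim is that an upper bound on $E_T$ can be driven below any chosen tolerance with a finite $T$ under mild conditions.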

Gaussian process regression (GPR), or kernel ridge regression, is a widely used and powerful tool for nonlinear prediction. Therefore, active learning (AL) for GPR, which actively collects data labels to achieve accurate prediction with fewer labels, is an important problem. However, existing AL methods do not theoretically guarantee prediction accuracy for the target distribution. Furthermore, as discussed in the distributionally robust learning literature, specifying the target distribution is often difficult. Thus, this paper proposes two AL methods that effectively reduce the worst-case expected error for GPR, that is, the worst-case expectation over the candidate target distributions. We show an upper bound on the worst-case expected squared error, which suggests that the error can be made arbitrarily small with a finite number of data labels under mild conditions. Finally, we demonstrate the effectiveness of the proposed methods on synthetic and real-world datasets.
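The idea of reducing a worst-case expectation over candidate target distributions can be illustrated with a small sketch. It is not claimed to be either of the paper's two methods: it is a plain greedy rule, assuming a finite pool of inputs, an RBF kernel, and candidate distributions given as probability vectors over the pool. Since the GP posterior variance does not depend on the observed labels, the variance is used here as a stand-in for the expected squared error. All function names and parameters (`rbf_kernel`, `posterior_variance`, `greedy_select`, `lengthscale`, `noise`) are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=0.5):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def posterior_variance(K, selected, noise=1e-2):
    """GP posterior variance at every pool point, given the labelled indices."""
    prior = np.diag(K).copy()
    if len(selected) == 0:
        return prior
    K_ss = K[np.ix_(selected, selected)] + noise * np.eye(len(selected))
    K_xs = K[:, selected]
    solved = np.linalg.solve(K_ss, K_xs.T)  # shape (n_selected, n_pool)
    return prior - np.sum(K_xs * solved.T, axis=1)


def worst_case_expected_variance(var, P):
    """Maximum over candidate distributions (rows of P) of the expected variance."""
    return float(np.max(P @ var))


def greedy_select(X, P, budget, noise=1e-2, lengthscale=0.5):
    """Greedily pick pool indices that minimise the worst-case expected variance."""
    K = rbf_kernel(X, X, lengthscale)
    selected, history = [], []
    for _ in range(budget):
        best_idx, best_val = None, np.inf
        for j in range(len(X)):
            if j in selected:
                continue  # assumption of this sketch: each pool point is labelled once
            var = posterior_variance(K, selected + [j], noise)
            val = worst_case_expected_variance(var, P)
            if val < best_val:
                best_idx, best_val = j, val
        selected.append(best_idx)
        history.append(best_val)
    return selected, history


# Usage: a 1-D pool with three candidate target distributions
# (left half, right half, and uniform over the whole pool).
X = np.linspace(0.0, 1.0, 40)[:, None]
left = (X[:, 0] < 0.5).astype(float)
right = 1.0 - left
P = np.vstack([left / left.sum(), right / right.sum(), np.full(40, 1.0 / 40)])

selected, history = greedy_select(X, P, budget=5)
print("selected pool indices:", selected)
print("worst-case expected variance per step:", history)
```

Because adding a label never increases the GP posterior variance at any point, the recorded worst-case value is non-increasing across steps. The nested loop refits the posterior for every candidate, which is fine for a small pool but would need rank-one updates to scale.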
