Gradient Descent in RKHS with Importance Labeling
This work addresses the labeling bottleneck in supervised learning for researchers and practitioners, offering an incremental improvement over existing sampling methods.
The paper addresses the high cost of labeling in supervised learning. It proposes a new importance labeling scheme that selects an informative subset of unlabeled data for least squares regression in an RKHS. It shows that gradient descent combined with this scheme achieves the optimal rate of convergence in wider settings than uniform sampling and generalizes better, especially when label noise is small.
Labeling is often expensive, and its cost is a fundamental limitation of supervised learning. In this paper, we study the importance labeling problem, in which we are given a large amount of unlabeled data, select a limited number of points to be labeled, and then run a learning algorithm on the selected points. We propose a new importance labeling scheme that can effectively select an informative subset of unlabeled data in least squares regression in Reproducing Kernel Hilbert Spaces (RKHS). We analyze the generalization error of gradient descent combined with our labeling scheme and show that, compared with the usual uniform sampling scheme, the proposed algorithm achieves the optimal rate of convergence in much wider settings and gives much better generalization, especially in a small label noise setting. Numerical experiments verify our theoretical findings.
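To make the setting concrete, the following is a minimal sketch of the general pipeline described above: pick a subset of unlabeled points by non-uniform sampling, request labels only for those points, and run gradient descent on an importance-weighted least squares objective in an RKHS. It is not the labeling scheme proposed in the paper, which the abstract does not specify. The sampling distribution here is a stand-in based on ridge leverage scores, and the Gaussian kernel, the bandwidth, the regularization constant, the step size rule, and all function names (`gaussian_kernel`, `leverage_probs`, `importance_label_and_fit`, `predict`, `label_fn`) are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=0.5):
    # Pairwise squared distances between rows of A and rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def leverage_probs(K, reg=1e-2):
    # Stand-in sampling distribution (an assumption, not the paper's scheme):
    # ridge leverage scores diag((K + n*reg*I)^{-1} K), normalized to sum to 1.
    n = K.shape[0]
    scores = np.diag(np.linalg.solve(K + n * reg * np.eye(n), K))
    scores = np.clip(scores, 1e-12, None)
    return scores / scores.sum()

def importance_label_and_fit(X, label_fn, m, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    p = leverage_probs(gaussian_kernel(X, X))

    # Choose m points to label; label_fn is a hypothetical labeling oracle
    # and is called only on the selected points.
    idx = rng.choice(n, size=m, replace=True, p=p)
    y = label_fn(X[idx])

    # Importance weights so that the weighted loss is an unbiased estimate
    # of the average squared loss over all n points.
    w = 1.0 / (m * n * p[idx])

    K_s = gaussian_kernel(X[idx], X[idx])
    sw = np.sqrt(w)
    # Step size 1 / (largest eigenvalue of W^{1/2} K_s W^{1/2}) keeps the
    # weighted loss non-increasing.
    eta = 1.0 / np.linalg.eigvalsh(sw[:, None] * K_s * sw[None, :])[-1]

    # f = sum_j alpha_j k(x_j, .); the RKHS functional gradient of the
    # weighted loss updates each coefficient by its weighted residual.
    alpha = np.zeros(m)
    losses = []
    for _ in range(steps):
        resid = K_s @ alpha - y
        losses.append(0.5 * np.sum(w * resid**2))
        alpha -= eta * w * resid
    return idx, alpha, losses

def predict(X_new, X, idx, alpha):
    # Evaluate the fitted function at new inputs.
    return gaussian_kernel(X_new, X[idx]) @ alpha
```

A uniform sampling baseline can be obtained from the same sketch by replacing `p` with `np.full(n, 1.0 / n)`, in which case every weight equals `1 / m`.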