On Kernel Regression with Data-Dependent Kernels
This work addresses kernel selection, a question of interest to machine learning researchers, but the contribution is incremental: it extends a known theoretical result to the data-dependent setting.
The paper tackles kernel selection in kernel regression by considering data-dependent kernels, which may be updated after seeing the training data, and shows that the optimal kernel in this setting is built from the posterior of the target function.
The primary hyperparameter in kernel regression (KR) is the choice of kernel. In most theoretical studies of KR, one assumes the kernel is fixed before seeing the training data. Under this assumption, it is known that the optimal kernel is equal to the prior covariance of the target function. In this note, we consider KR in which the kernel may be updated after seeing the training data. We point out that an analogous choice of kernel using the posterior of the target function is optimal in this setting. Connections to the view of deep neural networks as data-dependent kernel learners are discussed.
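The two optimality statements above can be checked numerically in a toy setting. The following is a minimal sketch, not the construction used in this note: it assumes a zero-mean prior on a 1-D grid with an exponential covariance, noiseless labels, and ridgeless KR. The grid, the length scales, and the helper names `ou_kernel` and `expected_test_mse` are illustrative choices. For the data-dependent case it further assumes one reading of "a kernel using the posterior", namely the posterior second moment E[f(x) f(x') | data], and it uses a pseudoinverse because that kernel matrix is rank one on noiseless training points.

```python
import numpy as np

def ou_kernel(x, length_scale):
    """Exponential (Ornstein-Uhlenbeck) kernel matrix on 1-D inputs."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / length_scale)

def expected_test_mse(kernel, prior_cov, train, test):
    """Expected test MSE of ridgeless KR with `kernel`, averaged over
    targets drawn from a zero-mean prior with covariance `prior_cov`."""
    # KR is linear in the labels: f_hat[test] = A @ f[train].
    A = np.linalg.solve(kernel[np.ix_(train, train)],
                        kernel[np.ix_(train, test)]).T
    S_te_te = prior_cov[np.ix_(test, test)]
    S_te_tr = prior_cov[np.ix_(test, train)]
    S_tr_tr = prior_cov[np.ix_(train, train)]
    # Covariance of the error f[test] - A @ f[train] under the prior.
    err_cov = S_te_te - A @ S_te_tr.T - S_te_tr @ A.T + A @ S_tr_tr @ A.T
    return np.trace(err_cov) / len(test)

x = np.linspace(0.0, 1.0, 50)
train = np.arange(0, 50, 5)                 # every 5th grid point is labeled
test = np.setdiff1d(np.arange(50), train)

# Fixed-kernel case: compare the prior covariance against mismatched kernels.
prior_cov = ou_kernel(x, 0.3)               # assumed prior covariance of f
mse = {ls: expected_test_mse(ou_kernel(x, ls), prior_cov, train, test)
       for ls in (0.03, 0.3, 3.0)}
print("expected test MSE by kernel length scale:", mse)

# Data-dependent case: draw one target, then build the posterior kernel.
rng = np.random.default_rng(0)
f = np.linalg.cholesky(prior_cov) @ rng.standard_normal(len(x))
y = f[train]

# Gaussian posterior given noiseless labels (assumes f is a Gaussian process).
W = np.linalg.solve(prior_cov[np.ix_(train, train)], prior_cov[train, :]).T
post_mean = W @ y
post_cov = prior_cov - W @ prior_cov[train, :]

# Assumed posterior kernel: second moment E[f(x) f(x') | data].
post_kernel = np.outer(post_mean, post_mean) + post_cov

# KR with the posterior kernel; pinv because the train block is rank one.
K_tr_tr = post_kernel[np.ix_(train, train)]
pred = post_kernel[np.ix_(test, train)] @ np.linalg.pinv(K_tr_tr, rcond=1e-8) @ y
print("max |KR prediction - posterior mean|:",
      np.max(np.abs(pred - post_mean[test])))
```

In this sketch the kernel matching the prior covariance (length scale 0.3) attains the lowest expected test error of the three, and KR with the assumed posterior kernel reproduces the posterior mean on the test points, which is the Bayes-optimal predictor under squared loss for this Gaussian model.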