Kernel Mean Shrinkage Estimators
This work addresses a fundamental problem in kernel-based machine learning, the estimation of the kernel mean from a finite sample, and offers improved estimators for researchers and practitioners who work with embeddings of probability distributions.
The paper tackles the problem of estimating the kernel mean in reproducing kernel Hilbert spaces, which is central to many kernel methods, by proposing kernel mean shrinkage estimators (KMSEs) that outperform the standard empirical average, particularly in high-dimensional, small-sample scenarios.
A mean function in a reproducing kernel Hilbert space (RKHS), or a kernel mean, is central to kernel methods: it is used by many classical algorithms such as kernel principal component analysis, and it also forms the core inference step of modern kernel methods that rely on embedding probability distributions in RKHSs. Given a finite sample, the empirical average is commonly used as the standard estimator of the true kernel mean. Despite the widespread use of this estimator, we show that it can be improved thanks to the well-known Stein phenomenon. We propose a new family of estimators called kernel mean shrinkage estimators (KMSEs), which are supported by both theoretical justification and good empirical performance. The results demonstrate that the proposed estimators outperform the standard one, especially in a "large d, small n" paradigm.
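To make the two estimators concrete, here is a minimal sketch in Python. It represents a kernel mean estimate as a weight vector over the sample, so that the estimate evaluated at x is the weighted sum of k(x_i, x). The empirical average uses uniform weights 1/n. The shrinkage variant shown here simply scales those weights by (1 - alpha), which shrinks the estimate toward the zero function. The Gaussian kernel, the shrinkage target, the hand-picked value of alpha, and all function names are assumptions made for illustration; the abstract does not specify them, and the data-driven choice of the shrinkage parameter is not reproduced here.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian RBF kernel matrix, k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def empirical_weights(n):
    """Uniform weights 1/n: the standard empirical kernel mean."""
    return np.full(n, 1.0 / n)


def shrinkage_weights(n, alpha):
    """Illustrative shrinkage toward the zero function (an assumption).

    alpha = 0 recovers the empirical average; alpha = 1 gives the zero function.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * empirical_weights(n)


def evaluate_mean(weights, X, query, sigma=1.0):
    """Evaluate the estimate sum_i weights[i] * k(X[i], .) at each query point."""
    return gaussian_kernel(query, X, sigma) @ weights


def rkhs_norm_sq(weights, X, sigma=1.0):
    """Squared RKHS norm of the estimate: weights^T K weights."""
    return float(weights @ gaussian_kernel(X, X, sigma) @ weights)


# A "large d, small n" toy sample: n = 10 points in d = 50 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))

w_emp = empirical_weights(len(X))
w_shr = shrinkage_weights(len(X), alpha=0.1)  # alpha chosen by hand here

print("squared norm, empirical:", rkhs_norm_sq(w_emp, X))
print("squared norm, shrunk:   ", rkhs_norm_sq(w_shr, X))
```

Because both estimates live in the span of the sample's kernel functions, comparing them only requires the kernel matrix. Whether a given alpha actually reduces estimation error depends on the unknown true kernel mean, which is why a practical estimator needs a principled way to choose it.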