A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD
This is an incremental tutorial for practitioners in machine learning and statistics, focusing on practical implementation rather than new theoretical advances.
The paper introduces kernel discrepancies (MMD, HSIC, KSD) and their estimators, including V-statistics, U-statistics, and incomplete U-statistics. It emphasizes kernel bandwidth selection and presents adaptive estimators that address the kernel selection problem.
This article provides a practical introduction to kernel discrepancies, focusing on the Maximum Mean Discrepancy (MMD), the Hilbert-Schmidt Independence Criterion (HSIC), and the Kernel Stein Discrepancy (KSD). Various estimators for these discrepancies are presented, including the commonly used V-statistics and U-statistics, as well as several forms of the more computationally efficient incomplete U-statistics. The importance of the choice of kernel bandwidth is stressed, showing how it affects the behaviour of the discrepancy estimates. Adaptive estimators are introduced, which combine multiple estimators with various kernels, addressing the problem of kernel selection.
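To make the estimators mentioned above concrete, here is a minimal sketch of quadratic-time estimates of the squared MMD. It is an illustration under stated assumptions, not the article's implementation: the Gaussian kernel, the function names (gaussian_kernel, mmd_squared), the sample sizes, and the bandwidth values are all chosen here for the example. The "unbiased" option uses the common two-sample form that drops the diagonal terms within each sample and keeps all cross terms.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)) (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth, unbiased=True):
    """Quadratic-time estimate of the squared MMD between samples xs and ys.

    unbiased=False: V-statistic, keeps the i == j terms within each sample.
    unbiased=True:  U-statistic-type estimate, drops the i == j terms
                    within each sample and keeps every cross term.
    """
    m, n = len(xs), len(ys)

    def within(points):
        size = len(points)
        total = sum(
            gaussian_kernel(points[i], points[j], bandwidth)
            for i in range(size)
            for j in range(size)
            if not (unbiased and i == j)
        )
        return total / (size * (size - 1) if unbiased else size * size)

    cross = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys) / (m * n)
    return within(xs) + within(ys) - 2.0 * cross


# Hypothetical demo data: two 1-D Gaussian samples whose means differ by 1.
random.seed(0)
sample_p = [[random.gauss(0.0, 1.0)] for _ in range(100)]
sample_q = [[random.gauss(1.0, 1.0)] for _ in range(100)]

# The same data, compared with three (assumed) bandwidths.
for bandwidth in (0.01, 1.0, 100.0):
    u = mmd_squared(sample_p, sample_q, bandwidth, unbiased=True)
    v = mmd_squared(sample_p, sample_q, bandwidth, unbiased=False)
    print(f"bandwidth={bandwidth:g}  U-type={u:.4f}  V-statistic={v:.4f}")
```

Running the loop shows the bandwidth sensitivity the abstract stresses. With a very small bandwidth the kernel is close to zero for almost every pair of distinct points, and with a very large one it is close to one for every pair, so in both extremes the estimate typically carries little information about the difference between the samples. Both estimators evaluate the kernel on all pairs and so cost quadratic time, which is what incomplete U-statistics reduce by summing over a subset of pairs.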