Robust Kernel-based Distribution Regression
This work proposes a more robust and flexible framework for distribution regression in machine learning applications. The contribution appears incremental, since it builds on existing kernel-based methods.
The authors tackle distribution regression with two-stage sampling by introducing a robust loss function that covers many popular losses and is not necessarily convex, extending the setting beyond least squares methods. They derive learning rates under different regularity conditions and show that the scaling parameter is important for both robustness and performance.
Regularization schemes for regression have been widely studied in learning theory and inverse problems. In this paper, we study distribution regression (DR), which involves two stages of sampling and aims at regressing from probability measures to real-valued responses over a reproducing kernel Hilbert space (RKHS). Recently, theoretical analysis of DR has been carried out via kernel ridge regression, and several learning behaviors have been observed. However, the topic has not been explored or understood beyond least-squares-based DR. By introducing a robust loss function $l_σ$ for two-stage sampling problems, we present a novel robust distribution regression (RDR) scheme. With a windowing function $V$ and a scaling parameter $σ$ that can be appropriately chosen, $l_σ$ covers a wide range of commonly used loss functions, which enriches the theme of DR. Moreover, the loss $l_σ$ is not necessarily convex, and hence substantially extends the regression class (least squares) previously considered in the DR literature. The learning rates under different regularity ranges of the regression function $f_ρ$ are comprehensively studied and derived via integral operator techniques. The scaling parameter $σ$ is shown to be crucial in providing robustness and satisfactory learning rates of RDR.
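The abstract does not spell out how $l_σ$ is built from $V$ and $σ$. As a minimal sketch, the code below assumes the form $l_σ(u) = σ^2 V(u^2/σ^2)$ that is common in the robust regression literature, and uses the Welsch and Cauchy windows as illustrative choices of $V$. The function names, the assumed form, and the two windows are assumptions made for illustration, not taken from the text above.

```python
import math


def robust_loss(u, sigma, window):
    """Assumed form: l_sigma(u) = sigma^2 * V(u^2 / sigma^2).

    u      -- residual (prediction minus response)
    sigma  -- scaling parameter, must be positive
    window -- windowing function V, mapping [0, inf) to the reals
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return sigma ** 2 * window(u ** 2 / sigma ** 2)


def welsch_window(s):
    # Illustrative choice: V(s) = 1 - exp(-s / 2).
    # Bounded by 1, so the resulting loss is bounded by sigma^2 and non-convex.
    return -math.expm1(-s / 2.0)


def cauchy_window(s):
    # Illustrative choice: V(s) = log(1 + s / 2).
    # Grows only logarithmically, so the resulting loss is also non-convex.
    return math.log1p(s / 2.0)


if __name__ == "__main__":
    # Compare the robust losses with the least squares loss u^2 / 2
    # for a small residual and for an outlier-sized residual.
    for sigma in (1.0, 100.0):
        for u in (0.5, 10.0):
            print(
                f"sigma={sigma:6.1f} u={u:5.1f} "
                f"welsch={robust_loss(u, sigma, welsch_window):.4f} "
                f"cauchy={robust_loss(u, sigma, cauchy_window):.4f} "
                f"least_squares={u ** 2 / 2:.4f}"
            )
```

Under these assumptions, a large $σ$ makes both losses behave like the least squares loss $u^2/2$, while a small $σ$ caps or damps the penalty on large residuals. This is one way to read the abstract's claim that the choice of $σ$ governs the trade-off between robustness and learning rates.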