ML LG May 24, 2021

Uncertainty quantification for distributed regression

arXiv:2105.11425v1
Originality: Incremental advance
AI Analysis

This work addresses uncertainty quantification for distributed regression, offering a practical solution for large-scale data analysis, though it is incremental as it builds on existing divide-and-conquer methods.

The paper tackles the computational challenge of scaling kernel ridge regression to large datasets by proposing a fully data-driven method to quantify uncertainty for averaged estimators in divide-and-conquer approaches, providing rigorous theoretical guarantees and sup-norm consistency results.

The ever-growing size of datasets renders well-studied learning techniques, such as Kernel Ridge Regression, inapplicable, posing a serious computational challenge. Divide-and-conquer is a common remedy: split the dataset into disjoint partitions, obtain the local estimates, and average them. This makes it possible to scale up an otherwise ineffective base approach. In the current study we suggest a fully data-driven approach to quantify the uncertainty of the averaged estimator. Namely, we construct simultaneous element-wise confidence bands for the predictions yielded by the averaged estimator on a given deterministic prediction set. The novel approach features rigorous theoretical guarantees for a wide class of base learners, with Kernel Ridge Regression being a special case. As a by-product of our analysis, we also obtain a sup-norm consistency result for divide-and-conquer Kernel Ridge Regression. The simulation study supports the theoretical findings.
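
The split, fit locally, and average step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation and not its confidence-band procedure: the Gaussian (RBF) kernel, the values of `lam` and `gamma`, the synthetic data, and the function names `rbf_kernel`, `krr_fit_predict`, and `dc_krr_predict` are all assumptions made for the example.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_fit_predict(X, y, X_new, lam=1e-3, gamma=1.0):
    """Kernel ridge regression on (X, y), evaluated at X_new."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    # Solve (K + n * lam * I) alpha = y for the dual coefficients.
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return rbf_kernel(X_new, X, gamma) @ alpha


def dc_krr_predict(X, y, X_new, n_parts=4, lam=1e-3, gamma=1.0, seed=0):
    """Divide-and-conquer KRR: average local predictions over disjoint partitions."""
    rng = np.random.default_rng(seed)
    # Shuffle indices, then cut them into n_parts disjoint groups.
    parts = np.array_split(rng.permutation(X.shape[0]), n_parts)
    local = np.stack(
        [krr_fit_predict(X[p], y[p], X_new, lam, gamma) for p in parts]
    )
    # Row j of `local` holds the predictions of the j-th local estimator.
    return local.mean(axis=0), local


# Synthetic example (assumed data): noisy sine curve, 400 points, 4 partitions.
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(400, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)
X_new = np.linspace(-2.5, 2.5, 20)[:, None]  # deterministic prediction set

avg, local = dc_krr_predict(X, y, X_new)
```

Each local fit solves a linear system of size n/m rather than n (for m partitions), which is where the computational saving comes from. The spread of the rows of `local` around `avg` gives only an informal sense of variability; it is not a substitute for the simultaneous confidence bands the paper constructs.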

Code Implementations: 1 repo