ML · LG · Mar 8, 2017

Unsupervised Ensemble Regression

arXiv:1703.02965v1 · 4 citations
Originality: Highly original
AI Analysis

This paper addresses the challenge of estimating responses and identifying the most and least accurate experts in regression without labels, offering a novel method for settings where the experts' deviations from the optimal predictor are uncorrelated.

The paper tackles unsupervised ensemble regression, where only the experts' predictions are available and there is no labeled data. It proposes U-PCR, a principal components approach that the authors report improves accuracy over the ensemble mean and median on a variety of regression problems.

Consider a regression problem where there is no labeled data and the only observations are the predictions $f_i(x_j)$ of $m$ experts $f_{i}$ over many samples $x_j$. With no knowledge of the accuracy of the experts, is it still possible to accurately estimate the unknown responses $y_{j}$? Can one still detect the least or most accurate experts? In this work we propose a framework to study these questions, based on the assumption that the $m$ experts have uncorrelated deviations from the optimal predictor. Assuming the first two moments of the response are known, we develop methods to detect the best and worst regressors, and derive U-PCR, a novel principal components approach for unsupervised ensemble regression. We provide theoretical support for U-PCR and illustrate its improved accuracy over the ensemble mean and median on a variety of regression problems.
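To make the setting concrete, here is a minimal Python sketch of why knowing the first two moments of the response can be enough to rank experts without labels. It is not U-PCR and not the authors' procedure. It rests on a stronger toy assumption than the abstract states: each expert equals the response plus zero-mean noise that is uncorrelated with the response and with the other experts' noise. Under that assumption, $E[(f_i - y)^2] = \mathrm{Var}(f_i) - \mathrm{Var}(y) + (E[f_i] - E[y])^2$, so each expert's error can be estimated from its predictions alone. The function names `simulate_experts` and `estimate_expert_mse`, the noise levels, and the response distribution are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_experts(n_samples=20000, noise_stds=(0.3, 0.6, 1.0, 2.0, 3.0), seed=0):
    """Toy data (assumption): expert i predicts y plus independent zero-mean noise."""
    rng = np.random.default_rng(seed)
    # Response with known mean 1.0 and variance 4.0 (illustrative values).
    y = rng.normal(loc=1.0, scale=2.0, size=n_samples)
    stds = np.asarray(noise_stds, dtype=float)
    noise = rng.normal(size=(stds.size, n_samples)) * stds[:, None]
    preds = y[None, :] + noise  # shape (m experts, n samples)
    return y, preds


def estimate_expert_mse(preds, y_mean, y_var):
    """Label-free MSE estimate per expert, valid under the toy assumption only.

    Uses E[(f_i - y)^2] = Var(f_i) - Var(y) + (E[f_i] - E[y])^2.
    """
    bias = preds.mean(axis=1) - y_mean
    return preds.var(axis=1) - y_var + bias ** 2


y, preds = simulate_experts()

# Only the predictions and the two known moments of y are used here.
est_mse = estimate_expert_mse(preds, y_mean=1.0, y_var=4.0)
best, worst = int(np.argmin(est_mse)), int(np.argmax(est_mse))

# The labels are used below purely to check the estimates in this simulation.
true_mse = ((preds - y) ** 2).mean(axis=1)

# The two baselines named in the abstract.
ens_mean = preds.mean(axis=0)
ens_median = np.median(preds, axis=0)

print("estimated MSE per expert:", np.round(est_mse, 2))
print("true MSE per expert:     ", np.round(true_mse, 2))
print("best expert:", best, "worst expert:", worst)
print("ensemble mean MSE:  ", np.mean((ens_mean - y) ** 2))
print("ensemble median MSE:", np.mean((ens_median - y) ** 2))
```

In this toy setup the ranking comes from the experts' variances alone. The paper's weaker assumption (uncorrelated deviations from the optimal predictor rather than from the response itself) requires the more careful treatment that the authors develop.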

