LG · ML · Nov 19, 2024

Regression for the Mean: Auto-Evaluation and Inference with Few Labels through Post-hoc Regression

arXiv:2411.12665v2 · 10.45 citations · h-index: 13 · ICML

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in statistical inference for machine learning applications where obtaining labeled data is resource-intensive, offering incremental improvements for domain-specific tasks.

The paper tackles the problem of high variance in Prediction Powered Inference (PPI) methods when labeled data is scarce, showing that PPI++ can underperform classical inference in such cases. It introduces two new techniques using robust regressors to achieve lower variance estimators in the few-label regime.

The availability of machine learning systems that can effectively perform arbitrary tasks has led to synthetic labels from these systems being used in applications of statistical inference, such as data analysis or model evaluation. The Prediction Powered Inference (PPI) framework provides a way of leveraging both a large pool of pseudo-labelled data and a small sample with real, high-quality labels to produce a low-variance, unbiased estimate of the quantity being estimated. Most work on PPI considers a relatively sizable set of labelled samples, which can be resource-intensive to obtain. However, we find that when labelled data is scarce, the PPI++ method can perform even worse than classical inference. We analyze this phenomenon by relating PPI++ to ordinary least squares regression, which also experiences high variance with small sample sizes, and use this regression framework to better understand the efficacy of PPI. Motivated by this, we present two new PPI-based techniques that leverage robust regressors to produce even lower variance estimators in the few-label regime.
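The link between PPI++ and regression can be made concrete for the simplest case, estimating a mean. The sketch below is illustrative and is not taken from the paper: it assumes the estimator has the form mean(Y) + lambda * (mean(f_unlabeled) - mean(f_labeled)), and it picks lambda by minimizing the variance of that expression, which gives lambda = Cov(Y, f) / ((1 + n/N) * Var(f)). That ratio is a scaled least-squares slope fitted on only n labelled points, which is why it becomes noisy when n is small. The function names, the pooled variance estimate, and the synthetic data in `compare_variance` are all assumptions made for illustration.

```python
import numpy as np


def ppi_pp_mean(y_labeled, f_labeled, f_unlabeled):
    """Power-tuned PPI-style mean estimate (illustrative sketch, not the paper's code).

    y_labeled   : true labels on the small labelled sample (n >= 2)
    f_labeled   : model predictions on the same labelled sample
    f_unlabeled : model predictions on the large unlabelled pool
    Returns (estimate, lam).
    """
    y = np.asarray(y_labeled, dtype=float)
    f_l = np.asarray(f_labeled, dtype=float)
    f_u = np.asarray(f_unlabeled, dtype=float)
    n, N = len(y), len(f_u)

    # Sample covariance between labels and predictions, using n labelled points only.
    cov_yf = np.cov(y, f_l, ddof=1)[0, 1]
    # Prediction variance, pooled over all predictions (an assumed design choice).
    var_f = np.var(np.concatenate([f_l, f_u]), ddof=1)

    # Variance-minimizing weight; fall back to the classical mean if predictions are constant.
    lam = 0.0 if var_f == 0 else cov_yf / ((1.0 + n / N) * var_f)

    estimate = y.mean() + lam * (f_u.mean() - f_l.mean())
    return float(estimate), float(lam)


def compare_variance(n, N=2000, trials=500, seed=0):
    """Empirical variance of the classical mean vs. the sketch above on synthetic data."""
    rng = np.random.default_rng(seed)
    classical, ppi = [], []
    for _ in range(trials):
        y_all = rng.normal(size=n + N)                      # hypothetical true labels
        f_all = y_all + rng.normal(scale=0.5, size=n + N)   # hypothetical noisy predictions
        classical.append(y_all[:n].mean())
        est, _ = ppi_pp_mean(y_all[:n], f_all[:n], f_all[n:])
        ppi.append(est)
    return float(np.var(classical)), float(np.var(ppi))
```

Calling `compare_variance(n)` for a few small and large values of `n` gives a rough way to see how the two estimators behave as labels become scarce. The outcome depends on the synthetic setup chosen here and should not be read as a reproduction of the paper's experiments.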
