PPI++: Efficient Prediction-Powered Inference
This work offers researchers and practitioners an efficient way to perform robust statistical inference when labeled data are limited. The contribution is incremental: it refines the existing prediction-powered inference (PPI) framework rather than introducing a new one.
The authors address estimation and inference from a small labeled dataset combined with a larger set of machine-learning predictions. The resulting method is computationally lightweight and yields confidence sets that always improve on classical intervals computed from the labeled data alone.
We present PPI++: a computationally lightweight methodology for estimation and inference based on a small labeled dataset and a typically much larger dataset of machine-learning predictions. The methods automatically adapt to the quality of available predictions, yielding easy-to-compute confidence sets -- for parameters of any dimensionality -- that always improve on classical intervals using only the labeled data. PPI++ builds on prediction-powered inference (PPI), which targets the same problem setting, improving its computational and statistical efficiency. Real and synthetic experiments demonstrate the benefits of the proposed adaptations.
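To make the idea of adapting to prediction quality concrete, the following is a minimal sketch for the simplest case, estimating a mean. It is not the authors' implementation. The function names, the `simulate` helper, and the plug-in tuning rule for the weight `lam` (covariance of labels and predictions divided by `(1 + n/N)` times the prediction variance) are assumptions of this sketch. Setting `lam = 0` recovers the classical interval from labeled data only, while a `lam` near 1 leans heavily on the predictions.

```python
import random
from statistics import NormalDist, fmean


def _cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def classical_mean_ci(y_lab, alpha=0.1):
    """Normal-approximation interval for the mean, labeled data only."""
    n = len(y_lab)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * (_cov(y_lab, y_lab) / n) ** 0.5
    center = fmean(y_lab)
    return center, (center - half, center + half)


def ppi_pp_mean_ci(y_lab, f_lab, f_unlab, alpha=0.1, lam=None):
    """Prediction-powered interval for a mean with a tuned weight lam.

    y_lab:   labels on the small labeled set (size n)
    f_lab:   model predictions on the same labeled points
    f_unlab: model predictions on the large unlabeled set (size N)
    lam:     weight on the predictions; estimated from data when None
    """
    n, N = len(y_lab), len(f_unlab)
    if lam is None:
        # Assumed plug-in rule: weight grows with how well the
        # predictions track the labels, and shrinks as n/N grows.
        f_all = list(f_lab) + list(f_unlab)
        var_f = _cov(f_all, f_all)
        lam = _cov(y_lab, f_lab) / ((1 + n / N) * var_f) if var_f > 0 else 0.0
    # Labeled mean, corrected by the gap between the two prediction means.
    theta = fmean(y_lab) + lam * (fmean(f_unlab) - fmean(f_lab))
    # Variance has a labeled-data term and an unlabeled-data term.
    resid = [y - lam * f for y, f in zip(y_lab, f_lab)]
    var_hat = _cov(resid, resid) / n + lam ** 2 * _cov(f_unlab, f_unlab) / N
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * var_hat ** 0.5
    return theta, (theta - half, theta + half), lam


def simulate(n, N, noise_sd, seed=0):
    """Hypothetical toy data: predictions equal the label plus noise."""
    rng = random.Random(seed)
    y_lab = [rng.gauss(2.0, 1.0) for _ in range(n)]
    f_lab = [y + rng.gauss(0.0, noise_sd) for y in y_lab]
    f_unlab = [rng.gauss(2.0, 1.0) + rng.gauss(0.0, noise_sd) for _ in range(N)]
    return y_lab, f_lab, f_unlab
```

For example, calling `ppi_pp_mean_ci(*simulate(200, 5000, 0.3))` and `classical_mean_ci` on the same labels lets the two interval widths be compared directly. With accurate predictions the estimated weight should be large and the interval noticeably narrower. As `noise_sd` grows, the weight should shrink toward zero and the interval should approach the classical one.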